The computer backup plan
By ABDUL GHAFFAR MEMON
CEO, KalSoft (Pvt) Ltd
Nov 26 - Dec 02, 2001
In business we use computers for data processing, and wherever we use data processing facilities we have to consider backup and contingency planning. Recently heavy emphasis has been placed on this issue.
Let us start the discussion with the business itself. If the business is running successfully without interruption we are not bothered, especially if the business is in the private sector. I previously represented the largest privatised bank in Pakistan and currently head a leading software house of Pakistan, so I will talk about backup and contingency planning with special reference to on-line banking systems. Interruption can be tolerable in certain industries, for instance in the manufacturing sector, where it may be no more than an ordinary problem. But in the public sector interruptions are not bearable beyond a certain duration.
The longer the interruption lasts, and the larger the share of the facilities needed to run the business smoothly and efficiently that is affected, the more adverse the impact. In some cases, if all the facilities needed to operate the branch banking system are disrupted and cannot be restored within a reasonable period, the bank has to close its doors. Even if it does not close for ever, a longer interruption creates more loss and more problems.
Take the current scenario in Pakistan. The Afghan crisis is going on and several businesses have suffered. Many foreign investors have fled, and many companies dependent on foreign business have had to shut down their operations. Recently Pakistan faced regular strikes and the entire economy suffered. Several buildings were burnt and destroyed. In a place where social factors are very unpredictable and situations lead to violent acts or terrorism, the need for backup is most crucial.
Why interruptions occur: Let us consider why interruptions occur. First of all, interruptions occur because of poor facilities, such as blackouts or interruptions in the supply of electricity. This is very frequent in Pakistan. The disturbance can also be the malfunction or failure of certain equipment or of a telecommunication line.
Viruses of several types are major threats which can destroy important data and even the software, and eventually harm their usability. Sometimes there are disturbances due to natural disasters such as earthquakes and floods. In the case of an earthquake or flood, sometimes the entire area where the facilities are situated is not accessible. Other reasons for disturbance could be sabotage by our own employees or by computing vendors.
Sometimes this creates far more problems than expected. At present, in many areas, sabotage and arson by people opposed to the state or to certain ideas create a lot of disturbance. In such cases they do not look at who is going to be affected and to what extent.
Ways of data backup: Because we face all these problems, we have to plan for contingency and keep a backup.
One form of keeping a backup is to provide redundancy in the equipment and other tools. For example, if we require four workstations or terminals, we provide five. If one telephone line for data communication is sufficient to meet the requirement, we keep a provision for another one. Similarly, we keep redundancy in the disk storage and utilize RAID technology. These things are very common in our part of the world. But a real backup still has to be created.
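The benefit of providing five workstations where four are required can be put into rough numbers. The sketch below is only an illustration: the function name and the 95% availability figure are assumptions, and it treats every unit as failing independently, which does not hold when a whole building becomes inaccessible.

```python
from math import comb


def prob_enough_units(installed: int, required: int, unit_availability: float) -> float:
    """Probability that at least `required` of `installed` units are working.

    Assumes every unit is up with the same probability and that units
    fail independently of each other (a simplification).
    """
    p = unit_availability
    return sum(
        comb(installed, k) * p**k * (1 - p) ** (installed - k)
        for k in range(required, installed + 1)
    )


# Hypothetical figure: each workstation is available 95% of the time.
no_spare = prob_enough_units(installed=4, required=4, unit_availability=0.95)
one_spare = prob_enough_units(installed=5, required=4, unit_availability=0.95)
print(f"four installed, four needed: {no_spare:.4f}")
print(f"five installed, four needed: {one_spare:.4f}")
```

Under these assumed figures, one spare unit raises the chance of having four working workstations from roughly 0.81 to roughly 0.98. A spare of this kind guards against the failure of single units, not against the loss of the site.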
Backup site preparation: We have to put up the equipment and other facilities for creating the backup, and for this we have to prepare a site. The site is to be prepared keeping in view all the aspects, especially integration and ease of operation. The site can be prepared in the same building in which the operational machine is installed, in which case it is normally separated by a wall. It can also be prepared within the same block of buildings, or a few kilometers away from the original site.
The selection of the site should be done carefully because each choice provides a lot of benefits and also creates a lot of problems. If the site is prepared in the same building, there will be a lot of problems if access to the building is not possible because of fire or a natural calamity such as an earthquake or flood.
If the site is prepared in another building within the same block or in the same area, the problem will arise if access to the area is not possible or not allowed because of a natural disaster. On the other hand, having the site in the same building has a lot of advantages as well. You do not have to keep more staff to look after the facilities. It is very easy to test the plans. Communication and backup of data are easier. A lot of facilities can be shared by both sites.
If we are planning the site in the next building, then we have to maintain that site with proper staff. We have to take care of the communication facilities and of the power facilities. If the site is prepared a few kilometers away from the original site, the problems are different. We have to take care of the maintenance and staffing of that site, and we have to take special care of the communication facilities and the switching of data communication and voice communication lines. Similarly, updating the data at the backup site will be more problematic if the backup site is a few kilometers away from the original one.
The cost of building the site will also vary with where we decide to keep it.
The backup of the equipment: We have to keep a backup of the DP equipment, i.e. main processor, workstations, links and other peripherals. The equipment can be selected depending on the selection of the site. If the backup site is in the same building, we can plan to utilize most of the existing workstations, disk space and so on. But if the site is a few kilometers away, it will be difficult to utilize the same workstations and disk drives, so additional workstations and disk space will have to be provided. If we are not maintaining a separate site, keeping another processor as a backup unit could be sufficient to handle the emergency. But again, as mentioned earlier, if the building in which these processors are situated is not accessible, the provision of another processor as a backup unit will not provide any benefit.
Electrical equipment: Similarly, backup of electricity is to be provided in the form of an electrical generator and UPS or battery backups. If the sites are separate, we do require a separate UPS and generator. Again, depending on the business process and the criticality of the processing, we can always think of not providing a UPS at the backup site. If we are utilizing the same building to house the other site, the electrical equipment located in the building may be sufficient to cater to the requirements of the alternate site.
Communication equipment: Communication equipment provides access to the data processing facilities at different locations. Telephone lines are used to communicate the data, and for this a switch is normally installed at the main site. If we are using the same building to house the second site, the same communication switch can be used. If we are selecting a site in the next building or building a site a few kilometers away, then we have to take care of the communication facilities. We have to install a UPS and backup generator at the original site to cater to blackouts and brownouts. Similarly, we have to provide spare communication lines and communication equipment at the original site in case these are not available. For this reason spare modems and spare leased lines can be used.
If we are selecting a site which is a few kilometers away, we have to pay special attention to the communication side. We will require a similar set or a subset of the equipment at the backup site if we plan to provide similar services. If we plan to provide a reduced set of services, then we require a reduced set of communication equipment at the backup site. Similarly, we will require better communication facilities at both sites for updating the data. To keep the data at both sites at the same status, we require frequent updating. If the backup site has to take over operations within a few minutes, then we have to plan in such a way that the database of the backup site is also updated in real time mode. If we can afford an interruption of a few hours, then we do not require such heavy investment in communication equipment and facilities. We can then arrange to take a backup from the old equipment and restore it on the new machine.
We adopt a procedure whereby a backup is taken every hour or so. In this case we have an hourly backup which can be restored at the new site in the shortest possible time. In the meantime we have to take care of the transactions that took place at the original site in the last hour, after the most recent backup was taken.
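The hourly procedure can be sketched as restoring the last backup and then re-applying the transactions posted after it. This is a minimal illustration, assuming that a journal of transactions survives separately from the backup; the `Transaction` record, the account numbers and the function name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Transaction:
    """One posted entry; all field names here are hypothetical."""
    timestamp: int   # minutes since the start of the day
    account: str
    amount: int      # positive = deposit, negative = withdrawal


def restore_at_backup_site(snapshot: Dict[str, int],
                           snapshot_time: int,
                           journal: List[Transaction]) -> Dict[str, int]:
    """Rebuild balances from the last hourly backup plus later transactions."""
    balances = dict(snapshot)  # copy so the backup itself is never modified
    for tx in sorted(journal, key=lambda t: t.timestamp):
        # Entries at or before the backup time are already in the snapshot.
        if tx.timestamp > snapshot_time:
            balances[tx.account] = balances.get(tx.account, 0) + tx.amount
    return balances


hourly_backup = {"A-001": 5000, "A-002": 1200}   # taken at minute 600 (10:00)
journal = [
    Transaction(590, "A-001", 500),    # already included in the backup
    Transaction(615, "A-001", -700),   # posted after the backup
    Transaction(640, "A-002", 300),    # posted after the backup
]
restored = restore_at_backup_site(hourly_backup, 600, journal)
print(restored)  # {'A-001': 4300, 'A-002': 1500}
```

If no such journal reaches the backup site, the transactions of the last hour have to be re-entered from other records, which is the gap the hourly interval leaves open.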
Data is processed on line or in batch mode. Sometimes updating is done in real time mode. Depending on the requirements, we have to update the data at the backup site in real time or in batch mode. Sometimes the data at the alternate site is not updated at all: only a backup is taken at regular intervals at the original site, and it is restored at the alternate site when that site is being made operational because of an emergency.
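The choice between real time updating, batch updating and restore-only backup follows from how long an interruption the business can afford. The sketch below shows one way to write that rule down. The thresholds are illustrative assumptions, since the text speaks only of "a few minutes" and "a few hours", and placing batch updating between the two is also an assumption.

```python
from enum import Enum


class UpdateMode(Enum):
    REAL_TIME = "update the backup database in real time"
    BATCH = "update the backup database in periodic batches"
    RESTORE_ONLY = "take regular backups and restore them only in an emergency"


# Illustrative thresholds only; the article gives no exact figures.
REAL_TIME_LIMIT_MINUTES = 15
BATCH_LIMIT_MINUTES = 60


def choose_update_mode(tolerable_interruption_minutes: int) -> UpdateMode:
    """Pick the cheapest mode that still meets the tolerable interruption."""
    if tolerable_interruption_minutes <= REAL_TIME_LIMIT_MINUTES:
        return UpdateMode.REAL_TIME
    if tolerable_interruption_minutes <= BATCH_LIMIT_MINUTES:
        return UpdateMode.BATCH
    return UpdateMode.RESTORE_ONLY


for minutes in (5, 45, 180):
    print(minutes, "->", choose_update_mode(minutes).name)
```

The modes are ordered from most to least expensive, so the rule stops at the first one that the tolerable interruption forces on us.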
The cost will escalate once we move the updating of data from batch mode to real time mode. We also have to allocate personnel. There should be a thin attendance of personnel at the backup site.
The personnel requirements will vary with the distance of the backup site from the original operational site. We have to keep certain staff on standby duty, and additional staff will be required to handle the logistics, transportation and execution of extra activities. For this purpose we have to work out the logistics. We have to appoint a Commander who should prepare the plan, and in that plan everything should be well defined.
Another suggestion is that we can utilize Internet and intranet utilities. A network needs to be developed whereby companies can avail the facility of storing data on the web. The Internet provides certain storage facilities for data backup on international servers for easy storage and access. VPNs, or Virtual Private Networks, can be utilized for moving heavy data from one place to another. VPNs are a highly secure mode of transferring data, since the data is encrypted before it is sent and decrypted on receipt using certain keys or passwords. This online medium is very cheap and is currently the most favoured in the world.
Defining responsibilities: The responsibilities of everyone should be defined. It should be defined who will initiate the task, which activities are to be carried out, and how a problem is to be detected. For example, it should be defined who will be notified in case of emergency, and on that basis the backup site will become the operational site. The thin staff that regularly handles the operations of the backup site should be strengthened. The Database Administrator, System Administrator and Operator will be informed. A separate set of staff looking after the transportation of staff to the backup site will be informed. The people connected with communication will be informed immediately so that they can make the necessary changes in the switch.
Need to be tested: The plan is not only to be made but also to be tested. There should be regular testing, which exposes many problems that must be overcome. Everyone should be notified. The concerned staff should be notified of the availability and destination of senior staff, operational staff, troubleshooters and implementation staff.
It should be made known who will be available, where and at what time. The staff should know that a particular manager is available at a particular address and at particular telephone numbers. Similarly, the customer should know that the sales representative or the manager is available at a particular place and at a particular telephone number. The backup site should contain all the facilities, including telephones, and should become operational within the shortest possible time.
The cost of planning and operating the backup site varies with the objective. Besides the actual cost, there are a lot of hidden costs involved. The cost escalates as we move the backup site away from the original operational site. We have to look not only at the rental of the space but also at the duplication of all equipment and staff.
If the cost becomes too high, the management will not sanction it. So the plan should be made in such a way that the cost is at a minimum when the backup plan starts, and additional facilities can be added later at additional cost.
Practical approach towards the problem: Now I come to the practical aspect of the problem. The other backup arrangement that is practical in our part of the world is normally a reciprocal arrangement. In that case two companies having data processing facilities agree to spare computer time for whichever company faces a problem. The company whose DP equipment is not operational can utilize the DP equipment of the other company at nominal cost. This is normally applicable where batch processing and batch updating are carried out. It is very economical, and the cost of the entire arrangement is the lowest.
Another arrangement, available mostly in the West but not at present in Pakistan, is to put the data processing and communication equipment on a truck and keep it mobile. When there is a problem at the original site, the facilities on the mobile unit can be used. This reduces a lot of expense, but it creates a lot of problems, especially on the communication side.
Another arrangement is a contract with the vendor of the computer equipment under which, in case of a problem, the vendor will provide backup equipment for an agreed sum. This is practised in Pakistan, and vendors often take on this responsibility for marketing reasons. It often creates problems, as vendors do not have a machine exactly matching that of the customer, and in the event of a disaster a lot of extra work is required.
There is another approach that is not practised in Pakistan but which I would like to advocate: the pooling of resources. A few companies having data processing facilities prepare one common backup site, which is available to whichever company is affected by a disaster. This approach reduces the cost.
But to be effective, the companies forming the consortium should have machines from the same vendor and identical equipment. The operating software should be the same. The companies should not be based in the same block of buildings, otherwise there will be a lot of problems. Especially in the case of a natural disaster such as an earthquake or flood, the problem will be aggravated if all the companies are in the same area.
Normally we see that management pays little attention to backup facilities. I saw that happening in banks. At a branch which operates on ledgers maintained by the bank staff, I see there is no concept of backup. It is difficult to keep a backup of manual ledgers.
This is because the cost of maintaining a backup will be almost double the expenditure of maintaining the original record. So this is not the practice, and neither the internal nor the external auditors object.
Similarly, it is very difficult to quantify the benefits of keeping a backup, and without quantified benefits it is very difficult to justify the project. The normal attitude is that this will not happen to us. But when it happens, it hurts most. I have seen a fire in the building of the bank I previously worked for; it was in the same building, quite adjacent to the Computer Centre.
It was in another department. Because of it, the DP facilities were not accessible for two days. No damage was done to equipment or data in the Computer Centre.
I experienced a fire in another bank in Karachi. The fire did not damage the equipment, but a lot of problems were created as the facilities of the Computer Centre were affected. In both cases the backup of data was available. The computer was up and running within a few hours, but for lack of facilities processing could not take place for a couple of days.
So far in Pakistan we have not been used to backup sites or systems. But we are planning, and the awareness is there. In the bank where I last worked, we planned for backup. The contingency planning was carried out in such a way that even if the Head Office of the bank is blown up, the executives of the bank can start working from the backup site within the shortest possible time, i.e. a couple of hours. So far we have not implemented or tested the plan, but we have started working on it.