Peter M. Curtis - Maintaining Mission Critical Systems in a 24/7 Environment

Maintaining Mission Critical Systems in a 24/7 Environment: summary and description

The new edition of the leading single-volume resource on designing, operating, and managing mission critical infrastructure.

Bridging engineering, operations, technology, and training, this comprehensive volume covers each component of specialized systems used in mission critical infrastructures worldwide. Throughout the text, readers are provided the up-to-date information necessary to design and analyze mission critical systems, reduce risk, comply with current policies and regulations, and maintain an appropriate level of reliability based on a facility's risk tolerance. Topics include safety, fire protection, energy security, and the myriad challenges and issues facing industry engineers today. Emphasizing business resiliency, data center efficiency, cyber security, and green power technology, this important volume:

Features new and updated content throughout, including new chapters on energy security and on integrating cleaner and more efficient energy into mission critical applications

Defines power quality terminology and explains the causes and effects of power disturbances

Provides in-depth explanations of each component of mission critical systems, including standby generators, raised access floors, automatic transfer switches, uninterruptible power supplies, and data center cooling and fuel systems

Contains in-depth discussion of the evolution and future of the mission critical facilities industry

Includes PowerPoint presentations with voiceovers and a digital/video library of information relevant to the mission critical industry

Maintaining Mission Critical Systems in a 24/7 Environment is a must-read reference and training guide for architects, property managers, building engineers, IT professionals, data center personnel, electrical & mechanical technicians, students, and others involved with all types of mission critical equipment.

1.2 Risk Assessment

Critical industries require an extraordinary degree of planning and assessment. It is important to identify the best strategies to reach the targeted level of reliability. To design a critical building with the appropriate level of reliability, the cost of downtime and the associated risks need to be assessed. It is important to understand that downtime occurs due to more than one type of failure: design failures, catastrophic failures, equipment failures, or failures due to human error. Each type of failure requires a different approach to prevention. A solid and realistic approach to business resiliency must be a priority, especially because the present critical infrastructure is inevitably designed with all its eggs in one basket.

Within banking and financial services, planning the critical area places considerable pressure on designing an infrastructure that evolves to support continuous business growth. Routine maintenance and equipment upgrades alone do not ensure continuous availability. The 24/7 operation of such services means an absence of scheduled interruptions for any reason, including routine maintenance, modifications, and upgrades. The main question is how and why infrastructure failures occur. Employing new methods of distributing critical power, understanding capital constraints, and developing processes that minimize human error are some key factors in improving recovery time in the event that critical systems are impacted by base-building failures.

Infrastructure reliability can be enhanced by conducting a formal Risk Management Assessment (RMA) and gap analysis, and by following the guidelines of the Critical Area Program (CAP). The RMA and the CAP are used in other industries and have been customized specifically for the needs of data center environments. The RMA is an exercise that produces a system of detailed, documented processes, procedures, checks, and balances designed to minimize operator and service provider errors. The CAP ensures that only trained and qualified people are authorized to have access to critical sites. These programs, coupled with Probability Risk Assessment (PRA), address the hazards to data center uptime. The PRA looks at the probability of failure of each type of electrical power equipment, and it can be used to predict availability, the number of failures per year, and annual downtime. The PRA, RMA, and CAP are facilitating agents when assessing each step listed below.

Engineering and design

Project management

Testing and commissioning

Documentation

Education and training

Operation and maintenance

Employee certification

Risk indicators related to ignoring facility process management

Standard and benchmarking
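The quantities a PRA is said to predict (availability, failures per year, and annual downtime) can be illustrated with a minimal sketch. It assumes the common steady-state model in which availability equals MTBF / (MTBF + MTTR), failures are independent, and equipment is arranged either in series or in redundant parallel. The class name, function names, and all figures are hypothetical and are not taken from the book.

```python
from dataclasses import dataclass
from math import prod

HOURS_PER_YEAR = 8760.0


@dataclass(frozen=True)
class Component:
    """One piece of power equipment (all figures here are hypothetical)."""
    name: str
    mtbf_hours: float  # mean operating time between failures
    mttr_hours: float  # mean time to repair after a failure

    @property
    def availability(self) -> float:
        # Steady-state availability: fraction of time the component is up.
        return self.mtbf_hours / (self.mtbf_hours + self.mttr_hours)

    @property
    def failures_per_year(self) -> float:
        # One failure per (uptime + repair) cycle.
        return HOURS_PER_YEAR / (self.mtbf_hours + self.mttr_hours)


def series_availability(components) -> float:
    """All components must work (assumes independent failures)."""
    return prod(c.availability for c in components)


def parallel_availability(components) -> float:
    """At least one redundant component must work (assumes independence)."""
    return 1.0 - prod(1.0 - c.availability for c in components)


def annual_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical example: a UPS in series with two redundant generators.
ups = Component("UPS", mtbf_hours=50_000.0, mttr_hours=8.0)
generators = [Component(f"generator {i}", 2_000.0, 24.0) for i in (1, 2)]
system = ups.availability * parallel_availability(generators)
downtime = annual_downtime_hours(system)
```

The parallel case shows why redundancy matters in this model: two units that are each unavailable 1% of the time are jointly unavailable only 0.01% of the time, provided their failures really are independent.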

Industry regulations and policies are more stringent than ever. They are heavily influenced by Basel II, the Sarbanes-Oxley Act (SOX), NFPA 1600, and the U.S. Securities and Exchange Commission (SEC). Basel II recommends "three pillars" (risk appraisal and control, supervision of the assets, and monitoring of the financial market) to bring stability to the financial system and other critical industries. Basel II implementation involves identifying operational risk and then allocating adequate capital to cover potential loss. In response to a series of corporate scandals, SOX came into force in 2002 with the following provisions: financial statements published by issuers are required to be accurate (Sec 401); issuers are required to publish information in their annual reports (Sec 404); issuers are required to disclose to the public, on an urgent basis, information on material changes in their financial condition or operations (Sec 409); and penalties of fines and/or imprisonment are imposed for noncompliance (Sec 802). The purpose of the NFPA 1600 Standard is to help the disaster management, emergency management, and business continuity communities cope with critical events. Keeping up with the rapid changes in technology has been a longstanding priority. The constant dilemma of meeting the required changes within an already constrained budget can become a limiting factor in achieving optimum reliability.

1.2.1 Levels of Risk

Risk can be described as the worst possible scenario that might occur while performing a task within the facility. Assessing risk measures how much we know, or can predict, about unforeseen circumstances. Managing risk is essential to the facility/IT team: having a proper change management process in place for planned events, together with event response procedures, can ultimately reduce downtime. Reducing the frequency of events and understanding their impact is the key to proper Critical Environment Management. Table 1.1 shows the three typical levels of impact (high, medium, and low) that result from an event occurrence.

Table 1.1 Levels of Risk Impact to Facilities

Risk Impact    Effects of System Failure

High
    It will cause an immediate interruption to the clients' critical operations, such as:
    - Activity requiring a planned major utility service outage, or temporary elimination in system redundancy of the critical environment.
    - Activity that would disrupt critical production operations.
    - Activity that would likely result in an unplanned outage or disruption of operations, if unsuccessful.

Medium
    There is time to recover without impacting the clients' critical operations, including any:
    - Activity requiring a planned service outage that does not affect systems, but may impact non-critical operations.
    - Activity that involves a significant reduction in system redundancy.
    - Activity that is not likely to result in an unplanned outage to the critical environment or disruption of operations, if unsuccessful.

Low
    It will not interrupt operations and will have minimum potential of affecting the clients' critical operations, including:
    - Activity involving systems directly supporting operations but the execution of which will be transparent to operations.
    - Activity that cannot result in an unplanned outage of the critical environment or impact operations, if unsuccessful.

None
    Activity not associated with the critical environment.
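
The impact levels in Table 1.1 could be applied in a change management workflow roughly as follows. This is only a sketch under stated assumptions: the `Activity` fields are hypothetical yes/no questions derived from the table's wording, and a real review would rely on engineering judgment rather than a handful of flags.

```python
from dataclasses import dataclass
from enum import Enum


class RiskImpact(Enum):
    NONE = "None"
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


@dataclass(frozen=True)
class Activity:
    """Hypothetical description of a planned activity (field names are assumptions)."""
    touches_critical_environment: bool = True
    # High-impact criteria from Table 1.1
    major_utility_outage: bool = False
    eliminates_redundancy: bool = False
    disrupts_production: bool = False
    outage_likely_if_unsuccessful: bool = False
    # Medium-impact criteria from Table 1.1
    noncritical_service_outage: bool = False
    reduces_redundancy: bool = False
    outage_possible_if_unsuccessful: bool = False


def classify(activity: Activity) -> RiskImpact:
    """Return the highest impact level whose criteria the activity meets."""
    if not activity.touches_critical_environment:
        return RiskImpact.NONE
    if (activity.major_utility_outage or activity.eliminates_redundancy
            or activity.disrupts_production
            or activity.outage_likely_if_unsuccessful):
        return RiskImpact.HIGH
    if (activity.noncritical_service_outage or activity.reduces_redundancy
            or activity.outage_possible_if_unsuccessful):
        return RiskImpact.MEDIUM
    # Remaining activities are transparent to operations and cannot cause an outage.
    return RiskImpact.LOW


# Hypothetical example: maintenance that temporarily removes one redundant path.
level = classify(Activity(reduces_redundancy=True))
```

Checking the high-impact criteria first means an activity that meets criteria at several levels is always assigned the most severe one.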

1.3 Capital Costs versus Operation Costs

Businesses are at the mercy of the mission critical facilities sustaining them. Each year, billions of capital dollars are spent on the electrical and mechanical infrastructure that supports IT around the globe. It is important to keep in mind that downtime can cost companies millions of dollars per hour or more. An estimated 94% of all businesses that suffer a large data loss go out of business within two years, regardless of the size of the business. The daily operations of our economic system and our way of life depend on critical infrastructure being available 100% of the time, with no exceptions.

Critical industries operate continuously, 24 hours a day, 365 days a year. Because conducting daily operations necessitates the use of new technology, more and more applications are packed into servers, and more servers are packed into a single cabinet. The growing number of servers operating 24/7 increases the need for power, cooling, and airflow. When a disaster causes the facility to experience lengthy downtime, a prepared organization is able to quickly resume normal business operations by using a predetermined recovery strategy. Strategy selection involves focusing on key risk areas and selecting a strategy for each one. Also, in an effort to boost reliability and security, the potential impacts and probabilities of these risks, as well as the costs to prevent or mitigate damages and the time to recover, should be established.
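
One way to bring together the factors named above (probability, impact, cost to prevent or mitigate, and time to recover) is an expected annual loss comparison. The sketch below is an illustration, not the book's method: it assumes loss scales linearly with recovery time, that a mitigation removes a fixed fraction of the expected loss, and every name and number is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """A key risk area with hypothetical estimates."""
    name: str
    annual_probability: float        # chance the event occurs in a given year
    recovery_hours: float            # expected time to recover per event
    downtime_cost_per_hour: float    # business cost of one hour of downtime
    mitigation_cost: float           # annual cost to prevent or mitigate
    mitigation_effectiveness: float  # fraction of expected loss removed (0 to 1)

    @property
    def expected_annual_loss(self) -> float:
        # Probability times impact, with impact taken as recovery time times hourly cost.
        return (self.annual_probability * self.recovery_hours
                * self.downtime_cost_per_hour)

    @property
    def net_benefit(self) -> float:
        # Loss avoided by mitigating, minus what the mitigation costs.
        avoided = self.expected_annual_loss * self.mitigation_effectiveness
        return avoided - self.mitigation_cost


def rank_by_net_benefit(risks):
    """Order risks so the most worthwhile mitigation comes first."""
    return sorted(risks, key=lambda r: r.net_benefit, reverse=True)


# Hypothetical figures for two risk areas.
risks = [
    Risk("cooling failure", 0.1, 8.0, 1_000_000.0, 900_000.0, 0.5),
    Risk("utility outage", 0.5, 4.0, 1_000_000.0, 500_000.0, 0.9),
]
ranked = rank_by_net_benefit(risks)
```

A negative `net_benefit` suggests that, under these assumptions, the proposed mitigation costs more than the loss it is expected to avoid, which is a prompt to look for a cheaper strategy rather than a reason to ignore the risk.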
