The use of multiple production environments can become important during the rollout of updates. Depending on the application, you may want to release updates only to a portion of users. For example, if you're adding a new feature to a web application, you can use load balancing to direct a small portion of users (say 10 percent) to the updated version while everyone else uses the old version. If there's an unforeseen problem with the updated version, it impacts only a fraction of your users.
Of course, this is a simple example, and this approach won't necessarily work in more complex cases. If the application update causes irreversible changes to a huge database, a more cautious approach is needed. This is where it may be necessary to replicate the full production environment and test the update there. If all goes well, you cut everyone over to the updated environment. If things don't go so well, your existing production environment remains intact and functional.
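One way to implement this kind of traffic split is with a load balancer that supports weighted routing between the old and new versions of an application. The following minimal Python sketch uses the boto3 SDK against an AWS Application Load Balancer; the listener and target group ARNs are placeholders, and the same call with weights of 0 and 100 performs the full cutover described above. Treat it as an illustration of the pattern, not a complete deployment pipeline.

```python
import boto3

elbv2 = boto3.client("elbv2")

# Placeholder ARNs for the listener and the two application versions.
LISTENER_ARN = "arn:aws:elasticloadbalancing:...:listener/app/example/..."
OLD_VERSION_TG = "arn:aws:elasticloadbalancing:...:targetgroup/v1/..."
NEW_VERSION_TG = "arn:aws:elasticloadbalancing:...:targetgroup/v2/..."

def shift_traffic(new_version_weight: int) -> None:
    """Send the given percentage of requests to the new version."""
    elbv2.modify_listener(
        ListenerArn=LISTENER_ARN,
        DefaultActions=[{
            "Type": "forward",
            "ForwardConfig": {
                "TargetGroups": [
                    {"TargetGroupArn": OLD_VERSION_TG,
                     "Weight": 100 - new_version_weight},
                    {"TargetGroupArn": NEW_VERSION_TG,
                     "Weight": new_version_weight},
                ]
            },
        }],
    )

shift_traffic(10)    # send 10 percent of users to the updated version
# ...monitor for problems, then either roll back or cut everyone over:
shift_traffic(100)
```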
Quality assurance (QA)/test environments are used for the testing of software updates or new applications. QA/test environments may closely mirror production environments to ensure the accuracy of test results. To achieve this parity, you may need to copy over production data to the QA/test environment, but the environments still remain carefully separated. We'll discuss some testing methods later in the chapter.
When sensitive data exists in the production environment, doing a verbatim copy to QA/test may not be feasible. It may be necessary to use dummy data that mimics the production data.
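One common approach is to generate records that match the production schema but contain no real values. The short Python sketch below shows the idea; the field names and record shape are invented for illustration.

```python
import random
import string

def fake_email() -> str:
    """Generate a plausible but non-real email address."""
    user = "".join(random.choices(string.ascii_lowercase, k=8))
    return f"{user}@example.com"

def fake_customer(customer_id: int) -> dict:
    """Build a dummy record mimicking a hypothetical production schema."""
    return {
        "id": customer_id,
        "name": f"Test User {customer_id}",
        "email": fake_email(),
        "card_last4": f"{random.randint(0, 9999):04d}",  # never a real card number
    }

# Load the QA/test database with 1,000 dummy customers.
test_data = [fake_customer(i) for i in range(1, 1001)]
```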
Staging environments are used for building out a system prior to releasing it to production. In reality, a staging environment is just a preproduction environment.
Development environments are typically used by software developers for creating new applications. Organizations that don't develop their own software may not need a dedicated development environment.
Scaling and Architecting Cloud Systems Based on Requirements
One of the prime advantages of cloud computing is that it enables on-demand computing, allowing you to deploy and pay for only the computing capacity you actually use. This is an attractive alternative to absorbing the cost of servers that sit on standby to handle bursts or cyclical spikes in compute requirements, such as end-of-month processing or, if you're a retailer, holiday sales loads.
Autoscaling is a cloud feature that automatically adds and removes resources based on demand. By paying only for what you need when you need it, you can take advantage of the immense computing power of the cloud without having to pay for servers that are just sitting idle during times of low demand.
For example, let's look at a small sporting goods retailer that uses a public cloud provider to host its e-commerce website. During normal operations, the retailer runs and pays for three web servers. During times of high demand, autoscaling provisions additional web servers to match the increased load. Suppose the retailer decides to run a TV commercial during a televised game on a Saturday afternoon. After the commercial airs, the website experiences a huge traffic spike and an increase in online orders. Once the load subsides to normal levels, autoscaling terminates the additional web servers so that the retailer doesn't keep paying for them when they're not needed. This works well because the retailer can match the load on the website with the needed amount of computing, memory, storage, and other back-end resources in the cloud. Combining this pay-as-you-go model with autoscaling maximizes cost efficiency because you don't have to purchase hardware up front for peak loads or future growth; autoscaling simply provisions more capacity when needed. With automation and rapid provisioning, adding capacity can be as simple as a few clicks in a console, and the resources are deployed immediately!
Contrast this scenario with what would happen without autoscaling. If the retailer were stuck with only three web servers, during the traffic spike the servers might slow down or crash. Adding more servers would be a manual, expensive, and time-consuming process that even in a best-case scenario would take several minutes to complete. By that time, the damage would have already been done.
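To make this concrete, here is a minimal sketch using Python and boto3 against AWS Auto Scaling; the group name is a placeholder. A target-tracking policy like this one is a common way to implement the behavior described above: the cloud adds web servers when the traffic spike pushes average CPU utilization above the target and removes them when the load subsides.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Placeholder name for the retailer's group of web servers.
GROUP_NAME = "webstore-asg"

# Keep average CPU near 60 percent: servers are added automatically during
# the commercial-driven spike and terminated once demand returns to normal.
autoscaling.put_scaling_policy(
    AutoScalingGroupName=GROUP_NAME,
    PolicyName="keep-cpu-near-60",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 60.0,
    },
)
```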
Understanding Cloud Performance
Cloud performance encompasses all of the individual capabilities of the various components as well as how they interoperate. The performance you are able to achieve with your deployment is a combination of the capabilities and architecture of the cloud service provider and how you design and implement your operations.
Ongoing network monitoring and management allow you to measure and view an almost unlimited number of cloud objects. If any parameter strays beyond your predefined boundaries, alarms can be generated to alert operations staff or even to run automated scripts that remedy the issue (an example of defining such an alarm follows the list below). Here are just a few of the things you may want to monitor:
Database performance
Bandwidth usage
Network latency
Storage I/O operations per second (IOPS)
Memory utilization
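As an example of turning one of these metrics into an alarm, the sketch below uses Python with boto3 and AWS CloudWatch; the instance ID and notification topic are placeholders. It raises an alert when average CPU utilization stays above 80 percent for two consecutive five-minute periods.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="web-server-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    Statistic="Average",
    Period=300,               # evaluate in five-minute windows
    EvaluationPeriods=2,      # must breach for two windows in a row
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    # Placeholder topic; this could equally trigger an automated remediation script.
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)
```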
Delivering High Availability Operations
By implementing a well-architected network using best design practices, and by selecting a capable cloud service provider, you can achieve high availability operations. You and the cloud provider share responsibility for achieving high availability for your applications running in the cloud.
The cloud provider must engineer its data centers for redundant power, cooling, and network systems, and create an architecture for rapid failover if a data center goes offline for whatever reason. As we discussed with computing, network, and storage pools, the cloud provider is responsible for ensuring high availability of these pools, which means that they're also responsible for ensuring redundancy of the physical components that compose these pools.
It's your responsibility as the cloud customer to engineer and deploy your applications with the appropriate levels of availability based on your requirements and budgetary constraints. This means using different regions and availability zones to eliminate any single point of failure. It also means taking advantage of load balancing and autoscaling to route around and recover from individual component failures, like an application server or database server going offline.
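As a sketch of what eliminating a single point of failure can look like in practice, the following Python/boto3 call (group name, launch template, subnets, and target group ARN are all placeholders) creates an autoscaling group that spreads at least two servers across two availability zones, so the loss of one zone leaves the application running behind its load balancer.

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="webstore-asg",                         # placeholder
    LaunchTemplate={"LaunchTemplateName": "webstore-template"},  # placeholder
    MinSize=2,    # never fewer than two servers
    MaxSize=6,
    # Subnets in two different availability zones (placeholders), so a
    # single zone failure can't take the whole application offline.
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",
    TargetGroupARNs=["arn:aws:elasticloadbalancing:...:targetgroup/web/..."],
)
```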
Managing and Connecting to Your Cloud Resources
By definition, your cloud resources are off-premises, which raises the question of how to connect to the remote cloud data center in a way that is both reliable and secure. You'll look at this question in this chapter. You'll also learn about firewalls, a mainstay of network security, and the role they play in cloud management deployments.
Managing Your Cloud Resources
It's instructive to note the distinction between managing your cloud resources and simply using them. Managing your cloud resources includes tasks such as provisioning VMs, deploying applications, and subscribing to a SaaS service such as hosted email. You'll typically manage your cloud services in one of three ways (a brief SDK example follows the list):
Web management interface
Command-line interface (CLI)
APIs and SDKs
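As a quick taste of the third option, here is a minimal Python sketch using the AWS boto3 SDK to list the virtual machines in an account; other providers' SDKs offer equivalent calls.

```python
import boto3

ec2 = boto3.client("ec2")

# Ask the cloud API for all VM instances and print their IDs and states.
response = ec2.describe_instances()
for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        print(instance["InstanceId"], instance["State"]["Name"])
```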
When getting started with the cloud, one of the first ways you'll manage your cloud resources is via a web interface the cloud provider offers. You'll securely access the web management interface over the Internet. Here are a few examples of what you can do with a typical cloud provider web interface: