Executive Summary
Storage, scalability, accessibility and security have been key problems for technology officers across industries ever since their operations moved onto enterprise platforms. With advances in information and communication technology and the growing popularity of social media and other data sources, organizations now generate huge volumes of data, which they want to exploit for intelligence that can help grow their businesses. Storing this data, retrieving it and bearing the associated costs are challenges for organizations regardless of industry or size. Cloud computing has emerged as a viable option that addresses most of these challenges and offers numerous other advantages. The term cloud computing and its advantages are described in detail in the following section.
In this report, the Amazon Web Services (AWS) cloud offerings are analyzed in detail, starting with a case study of Expedia, which adopted the AWS cloud in 2010. Expedia started the project on a small scale, has since rolled the solution out across its global offices and has benefitted immensely from this decision. The requirements of Expedia and the implementation and features of the solution are explained in detail in the Amazon Web Services section. The last section of the report is an overview of cloud computing, in which cloud computing terminology is explained. Several AWS terms that are mentioned in the Expedia case study are also explained in detail in that section.
Based on the research conducted, cloud technology appears set to play a central role in the future of computing. More and more organizations are adopting it as it has evolved over the years into a reliable, secure and predictable offering. Many service providers, including AWS, have bet big on the cloud and have come up with packages that are bringing this technology to a wider audience. The technology, which was once limited to enterprises, has now reached the general public, where adoption is also strong.
Introduction
Cloud computing, often referred to as “the cloud”, is the delivery of computing resources and services on demand. In cloud computing, the client’s data resides on servers hosted by the service provider, and the client can access the data as and when it is needed. The data is transferred over the internet and the service is billed on a pay-per-use basis. Businesses adopt cloud services as Software as a Service (SaaS), Platform as a Service (PaaS) or Infrastructure as a Service (IaaS). Adoption of cloud computing gives organizations several advantages.
First, it saves organizations the investment required to set up servers and network infrastructure.
Operating expenditure is also reduced, as companies need not handle ongoing maintenance or employ dedicated staff to run server room facilities.
Scalability issues are addressed, as storage can be expanded as needed by purchasing it from the provider.
Applications and data can be accessed from any networked computer.
The probability of data loss is very low.
PaaS helps organizations develop applications and enter the market faster.
The infrastructure scales to handle dynamic workloads.
Amazon Web Services
Amazon Web Services (AWS) started operations in 2006, offering IT infrastructure to businesses in the form of web services. AWS enabled organizations to replace the upfront capital investment associated with installing servers with low variable costs that scale with the business. Hardware and network capacity became easily accessible to companies, which found the prospect of saving thousands of dollars by adopting AWS attractive. AWS currently serves customers across the world through data centers located strategically in the U.S., Europe, Brazil, Australia, Japan and Singapore. Today, the platform is among the most sought-after infrastructure cloud solutions, offering low-cost pricing, agility, scalability, security, flexibility and elasticity.
The organization selected for the case analysis is Expedia, a leading online travel company that adopted AWS in 2010. Expedia’s businesses include leisure and business travel, managed through its brands expedia.com, hotels.com and hotwire.com. Expedia also powers bookings for leading airlines, hotels, websites, top consumer brands and other affiliates. The company chose AWS after an extensive evaluation that weighed it against other options, such as an on-premises virtualization solution and other cloud solutions. The company wanted to support Asia Pacific customers, and AWS was the only solution with the global infrastructure in place to meet that requirement. Expedia's major focus areas when considering cloud services were infrastructure, automation and proximity to customer locations, and on these criteria it identified AWS as the best fit. Pariveda Solutions, a premier consulting partner of AWS, was involved in implementing the cloud solution.
Expedia Worldwide Engineering (EWE), which supports all Expedia brands, adopted AWS as a major enabler of its commitment to innovation, technology and platform improvements. Accordingly, EWE launched Expedia Suggest Services (ESS) to help customers enter travel, destination and search terms correctly. The application offered suggestions as customers typed, enabling them to pick the desired term and reducing the chance of a wrong entry. This feature also saved customers considerable time. Expedia wanted to run ESS in locations close to its customers in order to minimize network latency.
Using AWS, ESS was built and went live within three months. The application used algorithms based on the customer's location and provided suggestions based on historical data. It was launched initially in the Asia Pacific region and was later expanded to cover the U.S. west coast and the European Union. Indexes and queries were stored using tools developed in-house. The implementation rewarded Expedia with improved performance: latency was reduced to 50 milliseconds from the pre-AWS figure of 700 milliseconds.
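The case study does not describe how ESS is implemented, so the following is only a minimal sketch of the general idea of location-aware suggestions ranked by historical data. The class name, the region labels and the booking counts are all hypothetical and chosen purely for illustration.

```python
from bisect import bisect_left


class SuggestIndex:
    """Toy prefix index: suggestions are ranked by past bookings in the user's region."""

    def __init__(self):
        self._counts = {}        # region -> {destination: past bookings}
        self._sorted_names = []  # all known destinations, kept in sorted order

    def add_history(self, region, destination, bookings):
        name = destination.lower()
        regional = self._counts.setdefault(region, {})
        regional[name] = regional.get(name, 0) + bookings
        pos = bisect_left(self._sorted_names, name)
        if pos == len(self._sorted_names) or self._sorted_names[pos] != name:
            self._sorted_names.insert(pos, name)

    def suggest(self, prefix, region, limit=3):
        prefix = prefix.lower()
        # All names sharing a prefix are contiguous in a sorted list.
        start = bisect_left(self._sorted_names, prefix)
        matches = []
        for name in self._sorted_names[start:]:
            if not name.startswith(prefix):
                break
            matches.append(name)
        regional = self._counts.get(region, {})
        # Most popular in this region first; ties broken alphabetically.
        matches.sort(key=lambda n: (-regional.get(n, 0), n))
        return matches[:limit]


# Hypothetical history data, not taken from the case study.
index = SuggestIndex()
index.add_history("apac", "Singapore", 900)
index.add_history("apac", "Sydney", 700)
index.add_history("apac", "San Francisco", 100)
index.add_history("us-west", "San Francisco", 800)
index.add_history("us-west", "Seattle", 600)
index.add_history("us-west", "Singapore", 50)

print(index.suggest("s", "apac"))
print(index.suggest("s", "us-west"))
```

The same prefix produces a different ranking for each region, which is why running such a service close to the customer, with regional data, can matter as much as the lookup itself.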
In addition to ESS, Expedia was running other high-volume applications on AWS by 2011, such as the Global Deals Engine (GDE). GDE provides Expedia's online partners with specific deals and allows them to create custom websites and applications using APIs and product inventory tools. Amazon Elastic MapReduce was provisioned to analyze the data coming from Expedia’s global websites, user interactions and supply data, which was stored in Amazon Simple Storage Service (Amazon S3). AWS provided auto-scaling, which enabled Expedia to match capacity to varying load cycles. The company also used AWS CloudFormation to deploy its front-end and back-end stack into an Amazon Virtual Private Cloud environment. Operational data showed that, by running on AWS and using its ability to scale the infrastructure efficiently, Expedia improved its data processing efficiency by 230 percent.
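The idea behind matching capacity to varying load can be sketched in a few lines. This is not the AWS Auto Scaling API; the function name, the per-instance capacity and the instance limits below are assumptions made for illustration only.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance,
                      min_instances=2, max_instances=20):
    """Return how many servers to run for the current load, within fixed bounds."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Never drop below the minimum (for availability) or exceed the maximum (for cost).
    return max(min_instances, min(max_instances, needed))


# Hypothetical load levels over a day, assuming each instance handles 100 requests/s.
for load in (50, 950, 5000):
    print(load, "req/s ->", desired_instances(load, 100), "instances")
```

A real auto-scaling policy would also smooth out short spikes and wait between adjustments, which this sketch leaves out.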
In addition to the applications mentioned above, Expedia developed an identity federation broker using AWS Identity and Access Management and AWS Security Token Service, so that system administrators and developers could use single sign-on to reach the AWS Management Console. Building on these successful implementations, the ESS and GDE services running on AWS were extended to Expedia's remaining geographies, including the eastern U.S., parts of the E.U. and Japan. Later, all the regional AWS accounts were consolidated into one AWS account.
Expedia gained substantial benefits from this implementation. It helped the organization develop applications much faster, scale up or down depending on volume and troubleshoot issues quickly. Data availability was ensured for Expedia’s customers across the world with minimal latency. The company has established a disaster recovery and business continuity plan and is currently working on a monitoring mechanism and on moving to a single infrastructure.
Overview of Cloud Computing
This section explains the key cloud computing and AWS terms, including those mentioned in the case study above.
The National Institute of Standards and Technology (NIST) divides cloud computing into three service models: SaaS, PaaS and IaaS.
Software as a Service (SaaS) - The most popular and fastest growing category is Software as a Service, in which software hosted in the cloud is accessed by end users over the internet. SaaS is not limited to enterprises and is becoming popular with the general public as well. It offers the added advantage of customization, as the customer can choose from multiple software components to create a solution specific to their needs. It should be noted that the terms cloud computing and cloud software (SaaS) do not mean the same thing, even though they are increasingly used interchangeably. SaaS is only the software component of the cloud, and numerous other components are needed to make cloud computing a reality.
Platform as a Service (PaaS) - In PaaS, pre-built software architectures are made available to customers over the cloud, which helps them build software applications and computing solutions without starting from the ground up. It is a great enabler for building solutions quickly and cost effectively, with the inherent advantage that the solution is hosted in the cloud. As the platform is offered as a service, the end user does not need to install it on a local machine or take care of upgrading it. If the new application has to be integrated with other existing applications, the end user must make sure that the required APIs and online tools are available for the platform. Adopting the PaaS model brings several major benefits in addition to the primary advantages of cost and time savings. Other major advantages include:
Lower risk - The platform's functionality comes tested, so the user can expect minimal risk in working with the platform.
Higher profit - The cost and time saved by not developing an application from the ground up translate directly into better profit margins.
Rapid prototyping - PaaS offers the convenience of creating and deploying concept applications without writing code.
Security and interoperability - Applications developed on a cloud platform inherit the security of that platform.
Yet another major advantage is that PaaS shields developers and end users from the complexities of the cloud architecture. Users are never exposed to the associated difficulties and are given a seamless experience.
Infrastructure as a Service (IaaS) - In this model, the infrastructure is virtual and organizations pay the IaaS provider based on their usage. Organizations benefit from not having to spend a large sum upfront on setting up server and network infrastructure. They also save the recurring expense of maintaining a team to manage their infrastructure facility. In IaaS, customers rent virtual servers and data is exchanged over the internet. The customer never sees a physical server, as the virtual server sits in the cloud and can be managed from anywhere over the internet with the right access credentials. Simply put, the customer can buy and manage processing time, network capacity, storage and other fundamental resources without spending a fortune upfront. Virtual servers are made possible by virtualization software such as VMware. A single physical machine can run multiple copies of operating systems, each of which can be rented out over the internet to customers.
Apart from these service models, clouds are also distinguished by how they are deployed: as private, public or hybrid clouds.
Private cloud - In a private cloud setup, the entire cloud infrastructure is dedicated to a single organization and the resources are not shared with others. The cloud can be hosted either at the customer's premises or at the vendor's premises. Private clouds are very expensive to maintain but provide the highest level of security, and are usually adopted by big enterprises that deal with highly confidential data.
Public cloud - In a public cloud, all of the cloud infrastructure is hosted by the provider on its own premises and access is provided to multiple clients. The clients have no visibility into or control over the cloud infrastructure, which is shared by multiple organizations. Public cloud solutions are cheaper than the other types, but may not provide the security level of a private cloud.
Hybrid cloud - A hybrid cloud attempts to combine the unique advantages of private and public clouds. Some organizations prefer a hybrid setup in which they host their confidential data in the private cloud and their less critical applications in the public cloud, and so enjoy the best of both worlds. A related pattern is cloud bursting, where organizations use their internal infrastructure for normal loads but reach out to an external cloud vendor during periods of peak demand.
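The cloud bursting pattern described above can be illustrated with a small calculation. This is only a sketch: the function name, the capacity figure and the hourly demand values are hypothetical.

```python
def split_load(demand, private_capacity):
    """Split demand between in-house infrastructure and an external cloud.

    Returns (served_in_house, burst_to_public_cloud).
    """
    if demand < 0 or private_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    in_house = min(demand, private_capacity)
    burst = demand - in_house  # only the overflow goes to the external vendor
    return in_house, burst


# Hypothetical demand (in requests per second) against 100 units of in-house capacity.
for demand in (80, 100, 150):
    print(demand, "->", split_load(demand, 100))
```

The organization pays the external vendor only for the overflow, which is what makes the pattern attractive for short peaks.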
Amazon Elastic Compute Cloud (EC2) - EC2 is the central service in the AWS offering. It is used to create, manage and use virtual servers that run the Windows or Linux operating system on a Xen hypervisor. The machine instances come in different sizes and are rented on an hourly basis. Because they can be spread across data centers located around the world, applications built on EC2 can be highly scalable, fault tolerant and redundant. As the computing instances are software based, each one is scalable, which makes it possible to build a virtual data center in the cloud.
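Hourly rental can be made concrete with a short calculation. The sketch below assumes that a partly used hour is billed as a full hour, and the hourly rate is a made-up figure rather than an actual AWS price.

```python
import math
from decimal import Decimal


def instance_cost(seconds_running, hourly_rate):
    """Cost of one instance under hourly billing (assumption: partial hours round up)."""
    if seconds_running < 0:
        raise ValueError("seconds_running must be non-negative")
    billable_hours = math.ceil(seconds_running / 3600)
    # Decimal avoids floating point rounding errors in currency amounts.
    return billable_hours * Decimal(hourly_rate)


# Hypothetical rate of 0.10 per hour.
print(instance_cost(3600, "0.10"))  # exactly one hour
print(instance_cost(3601, "0.10"))  # one second over: billed as two hours
```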
Amazon Elastic MapReduce - A web service that enables data analysts, researchers, businesses and developers to process large amounts of data easily and cost effectively. Elastic MapReduce uses a hosted Hadoop framework running on Amazon EC2 and S3. The service is applied in numerous areas, such as web indexing, log analysis, machine learning, data warehousing, financial analysis, bioinformatics and scientific simulation.
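The MapReduce programming model that Hadoop implements can be shown on a single machine. The sketch below counts requests per URL path in a handful of made-up log lines; it illustrates the map, shuffle and reduce phases only and does not use Hadoop or the Elastic MapReduce service.

```python
from collections import defaultdict


def map_phase(log_line):
    """Emit (path, 1) for each log line of the form '<METHOD> <PATH>'."""
    _method, path = log_line.split()[:2]
    yield path, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical web server log lines.
logs = ["GET /hotels", "GET /flights", "POST /booking", "GET /hotels"]
print(run_job(logs))
```

In a hosted framework the map and reduce functions run in parallel on many machines, with the framework handling the shuffle between them.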
Amazon Simple Storage Service (Amazon S3) - Amazon S3 stores objects of up to 5 TB each (a single upload request is limited to 5 GB, so larger objects are uploaded in parts). In S3, storage containers are called buckets and serve a function similar to that of a directory. S3 is not a file system: data is stored as objects, each identified by a key within its bucket. Buckets are named by the customer, and each name must be unique across all AWS customers. S3 supports the following operations:
Creating, editing and deleting buckets
Uploading objects to a bucket, and searching for and downloading them
Finding the metadata associated with buckets and objects, and allowing users to access them
S3 provides highly protected storage and excels where the storage is archival in nature. A typical application of S3 is a photo sharing site.
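The bucket and object model described above can be mimicked with an in-memory class. This is a toy model for illustration, not the S3 API: the class, its method names and the bucket name are hypothetical, and the photo data is a placeholder.

```python
class BucketNameTaken(Exception):
    """Raised when a bucket name is already in use by any customer."""


class ObjectStore:
    def __init__(self):
        self._buckets = {}  # bucket name -> {key: (data, metadata)}

    def create_bucket(self, name):
        if name in self._buckets:
            raise BucketNameTaken(name)  # names are unique across all customers
        self._buckets[name] = {}

    def delete_bucket(self, name):
        if self._buckets[name]:
            raise ValueError("bucket is not empty")
        del self._buckets[name]

    def put_object(self, bucket, key, data, metadata=None):
        self._buckets[bucket][key] = (bytes(data), dict(metadata or {}))

    def get_object(self, bucket, key):
        return self._buckets[bucket][key][0]

    def head_object(self, bucket, key):
        """Return metadata about an object without returning its data."""
        data, metadata = self._buckets[bucket][key]
        return {"size": len(data), **metadata}

    def list_objects(self, bucket, prefix=""):
        # Keys are flat strings; a shared prefix merely looks like a folder.
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))


# A hypothetical photo sharing site.
store = ObjectStore()
store.create_bucket("example-photos")
store.put_object("example-photos", "2016/beach.jpg", b"\xff\xd8...", {"owner": "alice"})
store.put_object("example-photos", "2016/city.jpg", b"\xff\xd8....", {"owner": "bob"})
store.put_object("example-photos", "2015/snow.jpg", b"\xff\xd8", {"owner": "alice"})
print(store.list_objects("example-photos", prefix="2016/"))
print(store.head_object("example-photos", "2015/snow.jpg"))
```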
AWS CloudFormation - Provides developers and system administrators with an easy way to create AWS resources and update them in an orderly way. CloudFormation provides sample templates that describe AWS resources. Users can also write their own templates to describe the resources and the associated runtime parameters needed to run an application. CloudFormation works out the order in which AWS services are provisioned and takes care of the dependencies between them. Once deployed, the AWS resources can be updated in a predictable way. Templates can be visualized as diagrams and edited through the drag-and-drop interface of AWS CloudFormation Designer.
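Provisioning resources in dependency order amounts to a topological sort. The sketch below shows that idea with Python's standard library (version 3.9 or later); it is not how CloudFormation itself is implemented, and the resource names and dependencies are a hypothetical stack.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical stack: each resource maps to the resources it depends on.
stack = {
    "Vpc": set(),
    "Subnet": {"Vpc"},
    "SecurityGroup": {"Vpc"},
    "Database": {"Subnet", "SecurityGroup"},
    "WebServer": {"Subnet", "SecurityGroup", "Database"},
}


def provisioning_order(resources):
    """Return an order in which every resource comes after its dependencies."""
    try:
        return list(TopologicalSorter(resources).static_order())
    except CycleError as exc:
        raise ValueError(f"circular dependency between resources: {exc.args[1]}") from exc


print(provisioning_order(stack))
```

Resources that do not depend on each other, such as the subnet and the security group here, may appear in either order, and a real service could create them in parallel.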
Amazon Virtual Private Cloud (VPC) - VPC acts as a bridge between an organization's existing network and the cloud. The organization connects its network resources to the AWS cloud through a Virtual Private Network and extends its firewalls and other security apparatus to the servers it provisions there. Additionally, a hardware VPN connection can be created between the corporate data center and the VPC, so that the AWS cloud serves as an extension of the data center.
AWS Identity and Access Management (IAM) - IAM is a web service that enables an organization to grant and control its users' access to AWS. IAM is used to control who can use AWS resources (authentication) and which resources they can access (authorization). The major features of IAM include:
Shared access to the AWS account - Other people can be granted permission to use AWS resources without sharing the account's password or access key.
Granular permissions - Different users can be given different levels of permission, depending on the requirement.
Secure access for applications that run on EC2 - Credentials can be securely provided to applications running on EC2 instances so that they can access AWS resources such as RDS or S3 buckets.
Identity federation - Users who already have identities elsewhere, for example in the corporate network or with an internet identity provider, can be given access to the AWS account.
PCI DSS compliance - IAM supports the processing, storage and transmission of credit card data in a manner compliant with PCI DSS.
IAM can be accessed through the AWS Management Console, the AWS command line tools, the AWS SDKs or the IAM HTTPS API.
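Granular permissions are expressed as policies made up of allow and deny statements. The sketch below is a heavily simplified model of how such statements could be evaluated: a request is denied by default, allowed when a matching statement allows it, and always denied when a matching statement denies it. Real IAM evaluation involves more policy types and conditions; the function names, the bucket name and the wildcard matching via fnmatch are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """True if value matches any pattern; '*' is treated as a wildcard."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def is_allowed(statements, action, resource):
    allowed = False  # default deny
    for statement in statements:
        if _matches(statement["Action"], action) and _matches(statement["Resource"], resource):
            if statement["Effect"] == "Deny":
                return False  # an explicit deny always wins
            if statement["Effect"] == "Allow":
                allowed = True
    return allowed


# Hypothetical policy for a photo bucket: read access, except under private/.
policy = [
    {"Effect": "Allow", "Action": ["s3:GetObject", "s3:ListBucket"],
     "Resource": "arn:aws:s3:::example-photos/*"},
    {"Effect": "Deny", "Action": "s3:*",
     "Resource": "arn:aws:s3:::example-photos/private/*"},
]

print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::example-photos/2016/beach.jpg"))
print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::example-photos/private/id.jpg"))
print(is_allowed(policy, "s3:DeleteObject", "arn:aws:s3:::example-photos/2016/beach.jpg"))
```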
AWS Security Token Service - Interacting with AWS requires security credentials, which are used to authenticate the caller and authorize the request. AWS Security Token Service (STS) provides temporary security credentials for use with AWS services and related applications. STS normally grants limited privileges for a limited period of time, which limits the risk of a security compromise. The three steps involved in using STS are:
Choose the region to be activated
Grant temporary security credentials using STS
Utilize the credentials to access the required resources
STS is beneficial in situations that involve delegation, identity federation, cross-account access and IAM roles. The most important benefit of using STS is security. Because the credentials are temporary, there is no need to manage long-term security credentials or embed them in applications that use AWS resources.
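The security benefit of temporary credentials comes from their built-in expiry. The sketch below simulates that behaviour locally and does not call the STS service; the class, the function, the key prefix and the durations are illustrative assumptions, although the four fields mirror what temporary credentials typically contain.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TemporaryCredentials:
    access_key_id: str
    secret_access_key: str
    session_token: str
    expiration: datetime

    def is_valid(self, now=None):
        """Credentials are usable only until their expiration time."""
        now = now or datetime.now(timezone.utc)
        return now < self.expiration


def issue_credentials(duration_seconds=3600, now=None):
    """Simulate a token service handing out short-lived, random credentials."""
    now = now or datetime.now(timezone.utc)
    return TemporaryCredentials(
        access_key_id="TEMP" + secrets.token_hex(8).upper(),
        secret_access_key=secrets.token_urlsafe(30),
        session_token=secrets.token_urlsafe(64),
        expiration=now + timedelta(seconds=duration_seconds),
    )


# A 15 minute session issued at a fixed, hypothetical point in time.
issued_at = datetime(2016, 1, 1, 12, 0, tzinfo=timezone.utc)
creds = issue_credentials(duration_seconds=900, now=issued_at)
print(creds.is_valid(now=issued_at + timedelta(minutes=10)))
print(creds.is_valid(now=issued_at + timedelta(minutes=20)))
```

A leaked set of such credentials stops working on its own once the expiration time passes, which is the property the paragraph above describes.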
AWS Management Console - The browser-based GUI for AWS, through which customers manage their compute, storage and other resources running on the AWS infrastructure. It interfaces with AWS services such as EC2, S3, Elastic Load Balancing (ELB), Relational Database Service (RDS), Auto Scaling, CloudWatch and OpsWorks. The AWS Management Console supports all the popular browsers and has separate consoles for Android and iOS.
References
Amazon Web Services. (2016). About AWS. Retrieved from aws.amazon.com: https://aws.amazon.com/about-aws/
Amazon Web Services, Inc. (2016). Expedia Case Study. Retrieved from aws.amazon.com: https://aws.amazon.com/solutions/case-studies/expedia/?pg=main-customer-success-page
AWS. (2016). Amazon Virtual Private Cloud (VPC). Retrieved from aws.amazon.com: https://aws.amazon.com/vpc/
AWS, Inc. (2016). AWS CloudFormation. Retrieved from aws.amazon.com: https://aws.amazon.com/cloudformation/
AWS, Inc. or its affiliates. (2016). What is IAM? Retrieved from docs.aws.amazon.com: http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html
Furht, B., & Escalante, A. (2010). Handbook of Cloud Computing. New York: Springer.
Landis, C., & Blacharski, D. (2013). Cloud Computing Made Easy. Virtual Global, Inc.
EdTech. (2013, November 14). What's the Difference Between Public, Private and Hybrid Clouds? Retrieved from edtechmagazine.com: http://www.edtechmagazine.com/higher/article/2013/11/whats-difference-between-public-private-and-hybrid-clouds
IBM. (n.d.). What is Cloud Computing. Retrieved from ibm.com: https://www.ibm.com/cloud-computing/what-is-cloud-computing
Ricky M, M. L. (2015, December 10). Security Token Service, Temporary Security Credentials for AWS. Retrieved from insideaws.com: http://www.insideaws.com/articles-tutorials/security/security-token-service-temporary-security-credentials-aws.html
Rouse, M. (2014, April). Amazon Elastic MapReduce (Amazon EMR). Retrieved from TechTarget: http://searchaws.techtarget.com/definition/Amazon-Elastic-MapReduce-Amazon-EMR
Rouse, M. (2014, April). AWS Management Console. Retrieved from TechTarget: http://searchaws.techtarget.com/definition/AWS-Management-Console
Sosinsky, B. (2011). Cloud Computing Bible. Indianapolis: Wiley Publishing, Inc.
Verstraete, S. (2015, July 1). How Expedia Implemented Near Real-time Analysis of Interdependent Datasets. Retrieved from AWS Big Data Blog: https://blogs.aws.amazon.com/bigdata/post/Tx1R28PXR3NAO1I/How-Expedia-Implemented-Near-Real-time-Analysis-of-Interdependent-Datasets