Cloud Computing for Big Data Analytics Projects
Jaiswal, Jitendra Kumar (2018)
Jaiswal, Jitendra Kumar
Metropolia Ammattikorkeakoulu
2018
All rights reserved
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-201805107466
Abstract
Data alone has no significance unless relevant information is extracted from it to support decision-making. Data analysis is the process of extracting useful information from available data, which in turn helps decision makers take appropriate action. Traditionally, data analysis was performed through a process known as Extract, Transform and Load (ETL) on relational database management systems (RDBMS), which were designed primarily for vertical growth (i.e. adding more Central Processing Unit (CPU) and Random Access Memory (RAM) capacity to a system). Now that the industry is in the “Exabyte and Zettabyte Age” of data, traditional approaches such as RDBMS face limitations in storing and processing these enormous volumes of data, because their architectural principles date from the 1970s. Such large amounts of data, structured or unstructured, have been termed “Big data”, which has five main properties: Volume, Velocity, Variety, Veracity and Value. Value is the most important of these, since its main purpose is to extract relevant information from the other four V’s. To solve the problem of storing and analysing data at this scale, various technologies have emerged in recent decades. “Cloud computing” is a platform in which thousands of servers work together to meet different computing needs, with billing done according to a ‘pay as you grow’ model.
This thesis studies the fundamentals of Cloud computing and Big data, and the benefits of a Cloud computing platform for Big data analytics projects. Because there are many Cloud Service Providers (CSPs), the thesis explores the Big data analytics solutions available from the industry’s top three Cloud computing service providers. The study includes a demo of a Big data analysis project on a leading Cloud computing platform, using publicly available data sets, to validate the power of Cloud computing.