Bibliographic record

Title: Distributed big data frameworks / by Moritz Becker
Additional title: Verteilte Big Data Frameworks: Eine Übersicht
Author: Becker, Moritz
Reviewer: Pichler, Reinhard
Published: Vienna, 2017
Extent: IX, 143 leaves : illustrations
Thesis note: Technische Universität Wien, diploma thesis, 2017
Note: Variant title as translated by the author
Language: English
Document type: Diploma thesis
Keywords (DE): verteilte Verarbeitungsmethoden / Big Data
Keywords (EN): distributed processing methods / Big Data
URN (persistent identifier): urn:nbn:at:at-ubtuw:1-103714
Access restriction: The work is freely available
Files: Distributed big data frameworks [1.49 MB]
Abstract (English)

The ever-increasing amount of data produced by the modern internet society challenges the corporations and information systems that must store and process it. Emerging trends such as the Internet of Things point to an even steeper growth in data volume than in the past, underscoring the relevance of big data. To bridge the gap between storage capacity and data access speed while keeping data processing economically feasible, the industry has created frameworks that scale data processing horizontally across large clusters of commodity hardware. The plethora of technologies developed since then makes entering the field of big data processing increasingly hard. This thesis therefore identifies the major types of big data processing along with the programming models designed to cover them. For each processing type, it gives an introductory overview of the most important open source frameworks and technologies, with practical examples of how they can be used. The thesis concludes by pointing out important extension projects to the presented base systems and by proposing a performance-centric comparison of Apache Spark and Apache Hadoop, which could help establish a deeper understanding of these systems and identify potential new research topics.