Dissertation
in fulfillment of the requirements for the academic degree
Doktor der Ingenieurwissenschaften
(Dr.-Ing.)
of the Faculty of Engineering
of the Christian-Albrechts-Universität zu Kiel
submitted in 2015
The Kiel Computer Science Series (KCSS) covers dissertations, habilitation theses, lecture notes, textbooks, surveys, collections, handbooks, etc. written at the Department of Computer Science at Kiel University. It was initiated in 2011 to support authors in the dissemination of their work in electronic and printed form, without restricting their rights to their work. The series provides a unified appearance and aims at high-quality typography. The KCSS is an open access series; all series titles are electronically available free of charge at the department’s website. In addition, authors are encouraged to make printed copies available at a reasonable price, typically with a print-on-demand service.
Please visit http://www.informatik.uni-kiel.de/kcss for more information, for instructions on how to publish in the KCSS, and for access to all existing publications.
1st Reviewer: Prof. Dr. Wilhelm Hasselbring, Christian-Albrechts-Universität zu Kiel
2nd Reviewer: Prof. Dr. Michele Lanza, University of Lugano
Date of the oral examination: November 30, 2015
In many enterprises, the number of deployed applications is constantly increasing. These applications – often several hundred – form large software landscapes. The comprehension of these software landscapes is frequently impeded by, for instance, architectural erosion, personnel turnover, or changing requirements. Furthermore, events such as performance anomalies can often only be understood in correlation with the states of the applications. Therefore, an efficient and effective way to comprehend such software landscapes in combination with the details of each individual application is required.
In this thesis, we introduce a live trace visualization approach to support system and program comprehension of large software landscapes. It features two perspectives: a landscape-level perspective using UML elements and an application-level perspective following the 3D software city metaphor. Our main contributions are 1) an approach named ExplorViz for enabling live trace visualization of large software landscapes, 2) a monitoring and analysis approach capable of logging and processing the huge number of method calls conducted in a large software landscape, and 3) display and interaction concepts for the software city metaphor beyond classical 2D displays and 2D pointing devices.
Extensive lab experiments show that our monitoring and analysis approach elastically scales to large software landscapes while imposing only a low overhead on the productive systems. Furthermore, several controlled experiments demonstrate increased efficiency and effectiveness in solving comprehension tasks when using our visualization. ExplorViz is available as open-source software at www.explorviz.net. Additionally, we provide extensive packages for our evaluations to facilitate the traceability and reproducibility of our results.
In many enterprises, the number of deployed applications is constantly increasing. These applications – often several hundred – form large software landscapes. The comprehension of such landscapes is frequently impeded by, for instance, architectural erosion, personnel turnover, or changing requirements. Furthermore, events such as performance anomalies can often only be understood in correlation with the states of the applications. Therefore, an efficient and effective way to comprehend such software landscapes in combination with the details of each application is required.
In this thesis, we introduce a live trace visualization approach to support system and program comprehension in large software landscapes. It features two perspectives: a landscape-level perspective using UML elements and an application-level perspective following the 3D software city metaphor. Our main contributions are 1) an approach named ExplorViz for enabling live trace visualization of large software landscapes, 2) a monitoring and analysis approach capable of logging and processing the huge amount of conducted method calls in large software landscapes, and 3) display and interaction concepts for the software city metaphor beyond classical 2D displays and 2D pointing devices.
Extensive lab experiments show that our monitoring and analysis approach elastically scales to large software landscapes while imposing only a low overhead on the productive systems. Furthermore, several controlled experiments demonstrate an increased efficiency and effectiveness for solving comprehension tasks when using our visualization. ExplorViz is available as open-source software on www.explorviz.net. Additionally, we provide extensive experimental packages of our evaluations to facilitate the verifiability and reproducibility of our results.
by Prof. Dr. Wilhelm Hasselbring
Software visualization is a non-trivial research field, and with his thesis Florian Fittkau has made original contributions to it. Florian Fittkau investigates how “live trace visualization” can be leveraged for the analysis and comprehension of large software landscapes. Specific contributions are the ExplorViz method with its monitoring and trace processing for landscape model generation, as well as its 3D visualization.
Highly innovative are the new techniques for printing such 3D visualizations into physical models for improved program comprehension in teams, and the new techniques for immersion into these 3D visualizations via current virtual reality equipment.
Besides the conceptual work, the thesis contains a significant experimental part and a multifaceted evaluation. This engineering dissertation has been extensively evaluated with advanced student and lab experiments, based on a high-quality implementation of the ExplorViz tools.
This thesis is a good read and I recommend it to anyone interested in recent software visualization research.
Wilhelm Hasselbring
Kiel, December 2015
This chapter provides an introduction to this thesis. Section 1.1 describes the motivation for our research. Afterwards, Section 1.2 presents the scientific contributions. Preliminary work is discussed in Section 1.3. Finally, Section 1.4 outlines the structure of this thesis.
Parts of this chapter have already been published in the following works:
In many enterprises, the number of software systems is constantly increasing. This can be a result of changing requirements, e.g., due to new laws or customers, which the company has to satisfy. Furthermore, the legacy systems often interact with each other through defined interfaces; for example, a database may be accessed by different programs. As a whole, the applications form a large, complex software landscape [Penny 1993] which can include several hundred or even thousands of applications.
The knowledge of the communication, internals, and utilization of such a software landscape often gets lost over the years [Moonen 2003; Vierhauser et al. 2013] due to, for instance, missing documentation. For these software landscapes, tools that support program and system comprehension become important. For example, they can provide essential insights into the landscape during the maintenance phase [Lewerentz and Noack 2004]. A software engineer might need to create or adapt features in the landscape. Therefore, she often needs to know the communication between the existing programs; the control flow inside an application is also of interest for finding the locations where she needs to make the adaptations [Koschke and Quante 2005]. In this context, the goal of DFG SPP1593 “Design For Future - Managed Software Evolution” is to invent approaches for so-called knowledge-carrying software to overcome the challenges of missing documentation [Goltz et al. 2015].
Another challenge concerning a large software landscape is the question of which applications are actually used and to what extent. The operation and support of software can cause substantial costs. These costs would not be incurred if unused applications were removed from the software landscape. However, asking every user whether she uses each application is often not feasible, and even if it is, she might use some applications, such as a database, only indirectly.
Recent approaches in the field of software visualization, e.g., [Panas et al. 2003; Greevy et al. 2006; Wettel and Lanza 2007; Hamou-Lhadj 2007; Dugerdil and Alam 2008], focus on the visualization of a single application. A drawback of visualizing only one application is that the communication and linkage between the applications involved in a transaction are omitted.
Another drawback of current approaches is the possible lack of traces associated with a feature. For example, a software engineer might analyze a feature called add to cart. The investigation of this feature might lead to interest in the related feature checkout cart. However, this feature might not be available as a trace. Often the required trace can be generated manually for one application, but this can become cumbersome in a large software landscape. In addition, one trace can only reveal information on its particular execution of operations, for instance, the response time of this single execution of an operation. If this response time is a statistical outlier, the user might draw false conclusions about the application.
Due to the huge amount of method calls conducted in a large software landscape – typically millions of method calls per second –, monitoring and creating the required traces of the executions for the visualization can become a further challenge [Vierhauser et al. 2013]. One server is not capable of processing such a huge amount of data in parallel to the actual execution of the software landscape.
This thesis makes the following three major scientific contributions (SC1 – SC3) including nine subcontributions:
SC1: An approach named ExplorViz for enabling live trace visualization of large software landscapes
SC1.1: A software landscape visualization featuring hierarchies to provide visual scalability
SC1.2: An interactive extension of the software city metaphor for exploring runtime information of the monitored application
SC1.3: A landscape meta-model representing gathered information about a software landscape
SC1.4: A proof-of-concept implementation used in three controlled experiments for comparing our visualization approach to the current state of the art in system and program comprehension scenarios
SC2: A monitoring and analysis approach capable of logging and processing the huge amount of conducted method calls in large software landscapes
SC2.1: A scalable, elastic, and live analysis architecture for processing the gathered monitoring data by using cloud computing
SC2.2: A proof-of-concept implementation used in three lab experiments showing the low overhead of the monitoring approach, and the scalability and elasticity of our analysis approach by monitoring up to 160 instances of a web application
SC3: Display and interaction concepts for the software city metaphor beyond classical 2D displays and 2D pointing devices
SC3.1: A gesture-controlled virtual reality approach for the software city metaphor
SC3.2: An approach to create physical 3D-printed models following the software city metaphor
SC3.3: Proof-of-concept implementations and a controlled experiment comparing physical 3D-printed models to using virtual models on the computer screen in a team-based program comprehension scenario
For all evaluations, we provide experimental packages to facilitate the verifiability, reproducibility, and further extensibility of our results. In the following, each contribution is described.
SC1: ExplorViz Approach For Enabling Live Trace Visualization of Large Software Landscapes
The first scientific contribution (SC1) of this thesis is an approach named ExplorViz to enable live trace visualization for large software landscapes, which supports a software engineer during system and program comprehension tasks. Our live trace visualization for large software landscapes combines distributed and application traces. It contains a 2D visualization on the landscape level. In addition, it features a 3D visualization utilizing the software city metaphor on the application level. By application level, we refer to the concerns of one single application, whereas the landscape level provides knowledge about the different applications and nodes in the software landscape.
Since a live visualization updates itself after a defined interval, we feature a time shift mode where the software engineer can view the history of old states of the software landscape. Furthermore, she is able to jump to an old state and pause the visualization to analyze a specific situation.
To cope with the high density of information which should be visualized, the major concept of ExplorViz is based on interactively revealing additional details, e.g., the communication on a deeper system level, on demand. The concept is motivated by the fact that the working memory capacity of humans is limited to a small amount of chunks [Ware 2013]. Miller [1956] suggests seven, plus or minus two, chunks which is also referred to as Miller’s Law. The ExplorViz concept also follows Shneiderman’s Visual Information-Seeking Mantra: “Overview first, zoom and filter, then details on demand” [Shneiderman 1996].
This contribution contains four subcontributions (SC1.1 – SC1.4) which are briefly described in the following.
SC1.1: Software Landscape Visualization Featuring Hierarchies to Provide Visual Scalability Our landscape-level perspective shows the nodes and applications of a software landscape. In addition, it summarizes nodes running the same application configuration into node groups; such equal application configurations typically exist in cloud environments. To understand the overall architecture of the software landscape, the user is primarily interested in the existing application configurations. Afterwards, the details about the concrete instances can be interactively accessed.
To provide further visual scalability, the nodes and node groups are visualized within the systems they belong to, which act as organizational units. Again, the details about a system can be accessed interactively, and out-of-focus systems can be closed to show only details about relevant systems.
SC1.2: Interactive Extension of the Software City Metaphor for Exploring Runtime Information On the application level, we use the 3D software city metaphor to display the structure and runtime information of a monitored application. Again, visual scalability is provided by interactivity. When accessing the perspective, the components are only opened at the top level, i.e., details are hidden. In our terms, components are organizational units provided by the programming language, e.g., packages in Java. By interactively opening and closing the components, the software engineer is able to explore the application and the gathered runtime information.
SC1.3: Landscape Meta-Model for Representing Information of a Software Landscape Furthermore, we provide a landscape meta-model for representing the gathered information of the software landscape. This model can be used as input for other tools. Thus, the gathered data is also reusable for other scenarios, e.g., automatically updating the configuration of an enterprise application landscape based on the monitoring data.
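The hierarchy described above – systems containing node groups, which contain nodes running applications – can be sketched as a simplified data model. All class and attribute names below are illustrative; the actual ExplorViz landscape meta-model contains further entities and attributes.

```python
from dataclasses import dataclass, field

@dataclass
class Application:
    name: str
    language: str = "Java"

@dataclass
class Node:
    ip_address: str
    applications: list = field(default_factory=list)

@dataclass
class NodeGroup:
    # Summarizes nodes running the same application configuration,
    # as typically found in cloud environments (SC1.1).
    name: str
    nodes: list = field(default_factory=list)

@dataclass
class System:
    # Organizational unit grouping node groups.
    name: str
    node_groups: list = field(default_factory=list)

@dataclass
class Landscape:
    systems: list = field(default_factory=list)
    # Application-to-application communication, e.g., (caller, callee, calls/s)
    communication: list = field(default_factory=list)

# Example: one system with a node group of two equally configured nodes
group = NodeGroup("webfrontend", nodes=[
    Node("10.0.0.1", applications=[Application("JPetStore")]),
    Node("10.0.0.2", applications=[Application("JPetStore")]),
])
landscape = Landscape(systems=[System("Shop", node_groups=[group])])
```

A model instance like this could serve as the input for other tools, e.g., a configuration updater that reads the node groups and their utilization.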
SC1.4: Proof-of-Concept Implementation Used in Three Controlled Experiments The full ExplorViz approach is implemented as open-source software and available from our website. To evaluate our live trace visualization approach, we conducted three controlled experiments.
The first controlled experiment compared the usage of ExplorViz to the trace visualization tool EXTRAVIS [Cornelissen et al. 2007] in a program comprehension scenario of the quality tool PMD. The experiment showed that ExplorViz was more efficient and effective than EXTRAVIS in supporting the solving of the defined program comprehension tasks. The second experiment was a replication of this experiment design with a smaller object system named Babsi. In this replication, the time difference was not significant; however, the correctness of the task solutions was significantly higher in the ExplorViz group. The third experiment compared our hierarchical landscape-level perspective to a mix of flat state-of-the-art landscape visualizations found in Application Performance Management (APM) tools in a system comprehension scenario. Again, the time difference was not significant, but the correctness of the solutions was significantly higher in the ExplorViz group.
SC2: Monitoring and Analysis Approach for Applications in Large Software Landscapes
In large software landscapes, several million method calls can be conducted each second. Therefore, the monitoring and analysis approach must scale with the size of the software landscape. Furthermore, the approach should be elastic to avoid unnecessary costs. A further requirement is a low monitoring overhead to keep the impact on the production systems as low as possible. Based on these requirements, we developed our monitoring and analysis approach, which is outlined in the following.
SC2.1: Scalable, Elastic, and Live Analysis Architecture Using Cloud Computing To provide a scalable, elastic, and live analysis of the monitored data, we utilize cloud computing and an automatic capacity manager named CapMan. Our approach is similar to the MapReduce pattern [Dean and Ghemawat 2010], but we feature multiple dynamically inserted preprocessing levels. When the master analysis node is about to become overutilized, a new preprocessing worker level is automatically inserted between the master and the monitored applications, and thus the CPU utilization of the master node decreases. If the master approaches overutilization again, another level of workers is inserted; in theory, this happens every time the master is about to become overutilized. If a worker level is no longer sufficiently utilized, it is dynamically removed and thus resources are saved.
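The dynamic insertion and removal of worker levels can be illustrated by a simplified control loop. The thresholds and function names below are illustrative assumptions, not taken from the actual CapMan implementation:

```python
HIGH_UTIL = 0.8   # illustrative threshold: master is about to be overutilized
LOW_UTIL = 0.2    # illustrative threshold: a worker level is underutilized

def rebalance(worker_levels, master_utilization, level_utilizations):
    """One step of a simplified scaling loop.

    worker_levels: number of preprocessing levels currently inserted
    between the monitored applications and the master analysis node.
    """
    if master_utilization > HIGH_UTIL:
        # Insert a new preprocessing level in front of the master,
        # lowering the master's CPU utilization.
        return worker_levels + 1
    if worker_levels > 0 and level_utilizations and max(level_utilizations) < LOW_UTIL:
        # An underutilized worker level is removed to save resources.
        return worker_levels - 1
    return worker_levels

# Master about to be overutilized: a worker level is inserted
assert rebalance(0, 0.95, []) == 1
# All worker levels underutilized: one level is removed
assert rebalance(2, 0.5, [0.10, 0.15]) == 1
```

In the real architecture, the decision is of course based on monitored CPU utilization over time rather than single samples, and inserting a level involves starting cloud instances.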
SC2.2: Proof-of-Concept Implementation Used in Three Lab Experiments We implemented our monitoring and analysis approach as a proof-of-concept implementation and provide the necessary additional components, such as the capacity manager, as open-source software on our website. For the evaluation of our monitoring and analysis approach, we conducted three lab experiments.
We evaluated the low overhead in the first lab experiment by comparing Kieker [van Hoorn et al. 2012], which had already been shown to impose a low overhead [Eichelberger and Schmid 2014], to our monitoring component using the monitoring benchmark MooBench [Waller 2014]. As a result, we achieved a speedup of about factor nine and an 89% overhead reduction.
The second lab experiment extended the first experiment by the live analysis of the generated monitoring data. This experiment showed that adding the analysis step only negligibly impacts the throughput, and thus our approach is capable of analyzing the monitored data live. Furthermore, we achieved a speedup of about factor 250 in comparison to Kieker.
We used our private cloud for the third lab experiment to evaluate the scalability and elasticity of our approach by monitoring elastically scaled JPetStore instances. At the peak, 160 JPetStore instances were monitored by our approach with two dynamically started worker levels, resulting in about 20 million analyzed method calls per second.
SC3: Display and Interaction Concepts for the Software City Metaphor
In addition to providing a live trace visualization, we investigated new ways to display and interact with the software city metaphor [Knight and Munro 1999] beyond classical 2D monitors and classical 2D pointing devices. For a more immersive user experience, we provide a Virtual Reality (VR) approach featuring an Oculus Rift DK1 as display and a Microsoft Kinect v2 for gesture recognition. Furthermore, we construct physical 3D-printed software city models from our application-level perspective to increase, for instance, the number of gestures conducted in a team-based program comprehension scenario. Both approaches and an evaluation are described in the following.
SC3.1: Gesture-Controlled Virtual Reality Approach By using an Oculus Rift DK1 and a Microsoft Kinect v2 for our VR approach, we achieve a more immersive user experience for exploring the software city metaphor. The Oculus Rift enables the user to perceive the model in 3D as if she were flying above the city. To provide an even more immersive experience, we utilize gestures for interacting with the model.
SC3.2: Approach to Create Physical 3D-Printed Software City Models We construct physical 3D-printed models following the software city metaphor of our application-level perspective and detail four envisioned scenarios where physical models could provide benefits. These are team-based program comprehension, effort visualization in customer dialog, saving digital heritage, and educational visualization.
SC3.3: Proof-of-Concept Implementations and a Controlled Experiment for Physical 3D-Printed Models For both approaches, we provide proof-of-concept implementations available in branches of our ExplorViz Git repository. Furthermore, we conducted a controlled experiment investigating the first envisioned usage scenario for the physical models in a team-based program comprehension scenario. Teams of two subjects in the experimental group solved program comprehension tasks using only a 3D-printed model, while the control group solved the tasks using a virtual model on a computer screen. Two discussion tasks were influenced positively by using the 3D-printed model and one task was influenced negatively. We attribute the positive influence to an observed increase in the number of gestures conducted and the negative influence to the less readable labels of the 3D-printed model.
This thesis builds on preliminary work which has already been published in several research papers. Furthermore, it is based on various student theses co-supervised by the author. In the following, we first briefly describe and list our publications according to three categories: Approach, Evaluations, and Support Projects. Papers fall into the former two categories if they are closely related and explicitly contribute to those parts of this thesis. The latter category contains work that is related but only indirectly contributes to this thesis. Afterwards, the related student theses and their contributions to this thesis are briefly presented.
In this publication, we present our overall ExplorViz method and each of its steps. Furthermore, first sketches of the landscape-level and application-level perspective are shown.
In this work, we describe the idea of multiple worker levels for analyzing the huge amount of generated monitoring records. Therefore, a worker and master concept and a scaling architecture are introduced. In addition, we show the MooBench benchmark results for comparing the analysis component of Kieker 1.8 and ExplorViz.
This technical report presents a plan of the contributions of this thesis and details an evaluation scenario for the application-level perspective.
Architecture conformance checking is introduced in this paper as a further usage scenario besides supporting system and program comprehension. Furthermore, we present a preliminary study of the scalability, and thus applicability, of our analysis approach.
Performance analysis is a further usage scenario of our approach. Besides introducing important aspects of the functionality for performance analysis, this publication exemplifies it on monitoring data gathered from the Perl-based application EPrints.
In this work, we present our approach to use VR for exploring the application-level perspective to provide an immersive experience. To enable VR, we use an Oculus Rift and provide further gesture-based interaction possibilities using a Microsoft Kinect.
The approach of constructing physical 3D models of our application-level perspective is presented in this work. Additionally, four potential usage scenarios for these physical models are described.
IT administrators often lack trust in automatic adaptation approaches for their software landscapes. Therefore, we developed a semi-automatic control center concept which is presented in this publication. This control center concept is used as a target specification in our extensibility evaluation for the ExplorViz implementation.
In this publication, we present a structured benchmark-driven performance tuning approach exemplified on the basis of Kieker. The last performance tuning step is equal to our developed monitoring component of ExplorViz. Therefore, the paper contains a performance comparison between Kieker and the monitoring component of ExplorViz.
Providing efficient and effective tools to gain program comprehension is essential. Therefore, we compare the application-level perspective of ExplorViz to the trace visualization tool EXTRAVIS in two controlled experiments to investigate which visualization is more efficient and effective in supporting the program comprehension process.
Based on the application-level visualization, we construct physical, solid 3D-printed models of the application. This publication presents the results of a controlled experiment comparing the usage of physical models to virtual, on-screen models in a team-based program comprehension process.
In this publication, we present our scalable and elastic approach for processing the monitoring data. Furthermore, we describe an evaluation where we monitor 160 JPetStore instances and use dynamically inserted analysis worker levels.
This work describes a controlled experiment where we compare our landscape-level perspective with a landscape visualization derived from current APM tools. As demo landscape, we modeled the technical IT infrastructure of Kiel University.
We developed a simulator to rate one Cloud Deployment Option (CDO) named CDOSim. After simulating one CDO, the simulator provides an overall rating of the option which is constructed from three metrics, i.e., response times, costs, and Service Level Agreement (SLA) violations.
As basis for CDOSim, the cloud simulator CloudSim was used. Since CloudSim assumes that the user intends to simulate how her infrastructure performs as a cloud platform, we had to integrate the user-centric perspective of a cloud environment user. These enhancements are detailed in this work.
CDOSim rates one CDO. However, a software engineer conducting a migration to a cloud environment intends to find the most suitable deployment for her context. Therefore, the tool CDOXplorer generates CDOs and rates them by calling CDOSim. Since often a huge amount of options exists, CDOXplorer uses genetic algorithms to generate promising CDOs.
SynchroVis uses the 3D city metaphor to display the concurrency of one application. It requires a static analysis before the program traces can be loaded. Concurrency, e.g., acquiring and releasing a lock object, is displayed through a special lock building. The different threads are visualized through different colors.
This submitted article is an extended version of [Frey et al. 2013] and provides details of the structure, functioning, and quality characteristics of CDOXplorer.
CAPS aims to simplify the instrumentation process of the monitored applications. For instrumentation, an application needs to be uploaded via a web interface, and CAPS integrates the required probes to capture provenance data during the execution of the monitored applications.
This is an abstract summarizing the results of [Frey et al. 2013].
This work includes the presentation of a monitoring component for Perl-based systems using Kieker [van Hoorn et al. 2012]. Furthermore, a performance analysis using different visualizations is conducted for EPrints.
In his bachelor’s thesis, Beye evaluated three communication technologies with respect to their performance in the context of transferring monitoring data. The evaluation resulted in identifying TCP as the fastest transportation technology for monitoring data. Therefore, we implemented our communication between monitoring and analysis component with a TCP connection.
Koppenhagen investigated several cloud scaling approaches. For his evaluation, he implemented the three most promising scaling approaches and evaluated them with regard to performance and cost efficiency by means of a cloud simulator. These approaches could further enhance our elastic trace processing approach, but their integration remains future work.
The bachelor’s thesis of Kosche evaluated several concepts of user action tracking for our web-based visualization aiming to record user actions during an experiment run. Since the aspect-oriented frameworks were not compatible with current Google Web Toolkit (GWT) versions, she integrated a manual approach. The recorded data of the experiment runs is part of each experimental package of our controlled experiments.
Matthiessen developed a general Remote Procedure Call (RPC) monitoring approach in cooperation with the master's project at that time. This backpacking approach is still used in the current monitoring component. As a proof of concept, he implemented and evaluated the concept for monitoring HTTP Servlet connections.
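The general idea behind such RPC monitoring – attaching trace context to each outgoing call so that the receiver can continue the caller's trace – can be sketched as follows. The header name and context format here are hypothetical, not the actual implementation:

```python
import uuid

TRACE_HEADER = "X-Trace-Id"  # hypothetical header name

def inject_context(headers, trace_id=None, next_order_index=0):
    """Caller side: attach the trace context to an outgoing RPC request."""
    trace_id = trace_id or uuid.uuid4().hex
    headers[TRACE_HEADER] = f"{trace_id}:{next_order_index}"
    return headers

def extract_context(headers):
    """Callee side: continue the caller's trace instead of starting a new one."""
    value = headers.get(TRACE_HEADER)
    if value is None:
        # No caller context present: this request starts a new trace.
        return uuid.uuid4().hex, 0
    trace_id, order = value.split(":")
    return trace_id, int(order)

# The callee reconstructs the caller's trace id and order index
headers = inject_context({}, trace_id="abc123", next_order_index=4)
assert extract_context(headers) == ("abc123", 4)
```

This allows distributed traces to be stitched together across application boundaries, which is what the landscape-level communication visualization relies on.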
In his bachelor’s thesis, Stelzer conducted a preliminary version of our elasticity evaluation for the live processing of monitoring data using cloud computing and thus tested our approach with multiple static worker levels in his study. In his experiment, there were up to nine monitored JPetStore instances.
Weißenfels investigated several trace reduction techniques regarding their efficiency and effectiveness. The best-performing technique was trace summarization, which is therefore also used by our analysis component.
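Trace summarization collapses traces with an identical call sequence into a single representative carrying aggregated statistics. A minimal sketch (greatly simplified relative to the actual analysis component) might look like:

```python
from collections import defaultdict

def summarize(traces):
    """Group traces by their call sequence and aggregate statistics.

    Each trace is a list of (operation, duration_ms) events; traces with
    the same operation sequence are summarized into one entry carrying
    the number of occurrences and the average total duration.
    """
    buckets = defaultdict(list)
    for trace in traces:
        shape = tuple(op for op, _ in trace)            # the trace's call sequence
        buckets[shape].append(sum(d for _, d in trace))  # total duration of this trace
    return {
        shape: {"count": len(durations),
                "avg_total_ms": sum(durations) / len(durations)}
        for shape, durations in buckets.items()
    }

traces = [
    [("Cart.add", 3.0), ("Db.query", 7.0)],
    [("Cart.add", 5.0), ("Db.query", 9.0)],   # same shape, summarized together
    [("Cart.checkout", 12.0)],
]
summary = summarize(traces)
assert summary[("Cart.add", "Db.query")]["count"] == 2
assert summary[("Cart.add", "Db.query")]["avg_total_ms"] == 12.0
```

Aggregating over many executions also avoids the outlier problem mentioned in the motivation: a single trace's response time may be a statistical outlier, whereas a summary reflects typical behavior.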
The bachelor’s thesis of Barbie investigated a different layout algorithm for the 3D city metaphor and thus our application-level perspective. Targeting a stable and compact layout, he developed a layout algorithm using quad trees [Finkel and Bentley 1974]. His evaluations show that under certain circumstances the layout is stable. However, it requires a large amount of computation time when the 3D city model grows, and it is not compact in most scenarios.
When the monitored software uses no or inappropriate organizational units for its classes, e.g., all classes are contained in one Java package, our interactive approach would not work as intended. Therefore, Barzel integrated a hierarchical clustering feature into ExplorViz based on the relations between the classes and their names.
Finke implemented the automatic tutorial mode into ExplorViz. Furthermore, she developed a configurable experimentation mode which shows generic linked question dialogs. Both modes were successfully used in our controlled experiments.
In the context of the bachelor’s project “Control Center Integration”, Gill integrated the capacity management phase into ExplorViz. As foundation, we provided our capacity manager CapMan which is also used for scaling our trace analysis nodes in the cloud. Furthermore, he implemented the migration of applications from one node to a target node.
Mannstedt integrated the anomaly detection phase into ExplorViz in the context of the bachelor’s project “Control Center Integration”. As basis for the anomaly detection, he used OPAD developed by Bielefeld [2012]. As evaluation, he compared different anomaly detection algorithms.
Also in the context of the bachelor’s project “Control Center Integration”, Michaelis integrated the root cause detection phase into ExplorViz. He implemented four algorithms for root cause detection and evaluated them by a comparison.