Root cause analyses, or RCA for short (in German: Fehler-Ursachen-Analyse), get to the bottom of problems. Used correctly, they not only improve the security of IT infrastructures (e.g., Atlassian environments) in companies, but also eliminate errors and thereby increase performance. Especially in large companies and corporations with complex structures, finding the cause of performance problems is very challenging. Success depends crucially on good communication between the employees involved and good coordination of the stakeholders.

Improved Performance and Stability of Atlassian Environments through catworkx Root Cause Analysis
The project at a glance
Requirements
Support with a Root Cause Analysis (RCA) across the entire Atlassian infrastructure and peripheral dependencies.
Solution
- Analysis of the customer's Atlassian environment, development of recommended actions, and joint implementation with the customer
- Setting up a monitoring environment and performing load simulations (ramp-up, stress tests, load tests, functional tests)
Benefits
- Targeted and efficient communication in all directions by involving all stakeholders in the solution-finding process
- Bundling of competencies
catworkx has many years of experience in the implementation and operation of Atlassian products and multi-layered infrastructure architectures. In addition, catworkx is adept at navigating complex corporate structures and brings a high level of expertise in project management as well as in collaboration with a wide range of stakeholders.
Root cause analysis at the start of the project
A provider from the telecommunications industry requested support from catworkx for a Root Cause Analysis (RCA) with the goal of improving internal system performance. catworkx accepted the challenge and started the project by analyzing the problem.
The first step involves narrowing down potential sources of error (technical analysis). Subsequently, the problems or sources of error are systematically investigated, and measures are taken to eliminate them permanently.
As an external service provider, catworkx takes a bird's-eye view of the problem and coordinates all parties involved: Atlassian, the internal specialist departments, IT, the outsourced infrastructure teams for databases, networks, and server operations, as well as app vendors, external consultants, service providers, and operators.
Data as the basis for the fact-based exploration of problems
Besides clarifying which key figures matter to the company and which do not, and defining clear terminology (e.g., "what does better mean, what does faster mean"), the investigation needs the company's internal data as its factual basis. Once it has been determined which key figure or value serves as the starting point (reference value), subsequent changes and deviations, both positive and negative, can be demonstrated, because all further measurements are related to this value. However, not all the data needed to resolve an incident is always available. Sometimes data must first be collected over the medium or long term in order to draw conclusions from it later, for example to see whether the respective application or its individual components are performing well or not. Here, catworkx evaluates the relevant data using monitoring, ideally at the customer's site.
The goal that catworkx pursues together with the client is always a well-founded analysis and presentation of the current situation and the potential for optimization. The outcome cannot be generalized: some customers have more potential for optimization, others less.
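The idea of relating every later measurement to one agreed reference value can be sketched in a few lines of Python. This is only an illustration: the function name `relative_deviation`, the choice of the median as the summary statistic, and all numbers are assumptions and do not come from the project described here.

```python
from statistics import median


def relative_deviation(baseline_ms, samples_ms):
    """Return the relative change of the median sample versus the baseline.

    A negative result means the measured value dropped below the reference
    value (faster response); a positive result means it rose above it.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return (median(samples_ms) - baseline_ms) / baseline_ms


# Hypothetical numbers for illustration only (response times in milliseconds).
baseline = 800.0                                   # agreed reference value
after_change = [620.0, 640.0, 600.0, 660.0, 630.0]  # measured after a measure

change = relative_deviation(baseline, after_change)
print(f"Change versus reference value: {change:+.1%}")
```

Because every result is expressed relative to the same reference value, "faster" and "better" become statements that can be checked against numbers instead of impressions.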
Facts through the establishment of measurement environments
If the customer does not have a mature monitoring environment that can be accessed, the so-called “vitality status” of the applications and all associated components can be determined by setting up measurement environments. This means additional work, but it can be worth it: the resulting data shows, in black and white, how the system behaves and allows correlations to be derived, for example between day and night, specific days of the week, or working hours. Bottlenecks and application response times can also be identified. All of this is important information that enables the company to take targeted action, eliminate problems sustainably, and thus increase its performance and added value.
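How such correlations with time of day or day of the week might be derived from collected measurements is sketched below. The helper `mean_by_bucket` and the sample data are hypothetical; a real measurement environment would supply far more data points and would typically rely on a dedicated monitoring tool.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_by_bucket(samples, key):
    """Group (timestamp, response_ms) pairs by key(timestamp) and average them."""
    buckets = defaultdict(list)
    for timestamp, response_ms in samples:
        buckets[key(timestamp)].append(response_ms)
    return {bucket: mean(values) for bucket, values in buckets.items()}


# Hypothetical measurements: (timestamp, response time in milliseconds).
samples = [
    (datetime(2024, 1, 8, 3, 0), 210.0),     # Monday, at night
    (datetime(2024, 1, 8, 10, 0), 950.0),    # Monday, working hours
    (datetime(2024, 1, 9, 10, 30), 1050.0),  # Tuesday, working hours
    (datetime(2024, 1, 13, 10, 0), 300.0),   # Saturday
]

by_hour = mean_by_bucket(samples, lambda ts: ts.hour)
by_weekday = mean_by_bucket(samples, lambda ts: ts.weekday())  # 0 = Monday

print(by_hour)
print(by_weekday)
```

Comparing the averages per bucket gives a first indication of whether slow responses cluster in working hours or on particular days.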
The alternative: Set up load simulations correctly and evaluate them optimally
Another possibility is to carry out load simulations: in this case, the company's environment is replicated as accurately as possible. Load simulations can reproduce specific categories of incidents and thus contribute to finding solutions. Regardless of the method, the result should always be a well-founded analysis of the current situation and the potential for optimization.
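A ramp-up, one of the load simulation types mentioned in the project overview, can be illustrated with a minimal Python sketch. It is not the tooling used in the project: `fake_request` is a stand-in that merely sleeps, and the step sizes and request counts are assumptions. In practice, the request function would call the replicated test environment, never production.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request():
    """Stand-in for a real request against a replicated test environment."""
    start = time.perf_counter()
    time.sleep(0.01)  # simulated service time (assumption)
    return time.perf_counter() - start


def ramp_up(request, steps, requests_per_user=3):
    """Run `request` with a growing number of concurrent users.

    Returns a dict mapping the number of users to the slowest observed
    duration (in seconds) at that load level.
    """
    results = {}
    for users in steps:
        with ThreadPoolExecutor(max_workers=users) as pool:
            futures = [pool.submit(request) for _ in range(users * requests_per_user)]
            durations = [future.result() for future in futures]
        results[users] = max(durations)
    return results


report = ramp_up(fake_request, steps=[1, 2, 4])
for users, slowest in report.items():
    print(f"{users} concurrent users: slowest request {slowest * 1000:.1f} ms")
```

Increasing the load step by step makes it easier to see at which level response times start to degrade, which is the kind of finding a load simulation is meant to reproduce.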
Good performance = more profit for a company
In the case of the telecommunications industry customer, all components and infrastructure sections necessary for operation were evaluated. Subsequently, concrete recommendations for action were developed and a rough estimate of feasibility and effort was prepared; both were then presented to and discussed with the customer in regularly recurring meetings. The implementation of the measures took place during the ongoing project as well as in a follow-up project with previously specified components.
Conclusion:
The fact is that companies cannot afford errors and failures in processes and systems in the long run, because poor performance is costly. Root Cause Analyses are an effective tool when it comes to identifying problems. In retrospect, they allow conclusions to be drawn and provide important numerical data. Good performance, on the other hand, means more efficiency, more productivity, and ultimately more profit for the company – as the telecommunications industry customer also recognized.