Formalizing Metamodel of Requirements Management System

Requirements play an important role in the development of safety-critical software. Achieving a reasonable ratio of quality to cost requires tool support for requirements management. The paper presents a formal definition of the metamodel that underlies the Requality requirements management tool. Experience gained in implementing the metamodel is also discussed.


Introduction
The development of complex systems is always a sophisticated task. The development of complex safety-critical systems, where the cost of errors is especially high, is particularly complicated. Modern best practices suggest that precise and accurate requirements management is an important element in solving that task. Requirements management in the context of safety-critical system development includes the following aspects:
• building a catalogue of requirements;
• traceability links to sources of requirements;
• traceability links from other development artefacts, such as tests and code, to requirements;
• configuration and version management;
• change management, including change impact analysis.
The paper presents a formal definition of the metamodel that underlies the Requality requirements management tool, which aims to cover all of these aspects. Implementation details of the metamodel in the tool are also discussed, and future directions are considered.

Related works
The problem of requirements management is not new, and the activity has been recognised as important for many years. As an example, we may cite a publication from 1997: "The inability to produce complete, correct, and unambiguous software requirements is still considered the major cause of software failure today" [1]. Nevertheless, requirements engineering is still the subject of ongoing research. Some works define a methodology [2], a model [3] or a framework [4]. There are also papers that present the development history of particular tools, such as [5]. Some papers describe both a requirements model and its application in a specific tool. For example, [6] designs a tool for managing requirements in the form of specific models, and [7] gives some details of a feature management tool for product lines. Another paper [8] defines requirements as constraints and examines core concepts related to their implementation in a real tool. There are many commercial requirements management tools, but little information is available about their architecture and implementation details. Only a few open source tools, such as ProR [9] and ReqLine [10], are known and cited in publications. None of the papers on these tools discusses the core model in a formal way. Some approaches and models are listed in [11], but it covers mostly methodological aspects.

Preliminaries
Software development can be organised in different ways. There are some general views on the functions of a requirements management tool, but the set of requirements for such a tool may differ between application areas. One way to deal with this problem is to develop a model of the tool, an approach that can be found in [7] and [8]. The model helps to define the core concepts of the tool and to prove theorems about its functions. We need to introduce some terminology before building the model. First, we define what a requirement is. In this paper, a requirement means a limitation or definition of the functionality of some system or component. In our model, requirements are unique objects that may have a description written in natural language and that are placed in the tree structure defined below.

Base model
Definition 1. A tree G = (V, E, r_0) consists of:
• V, a set of vertices;
• E ⊂ V × V, a set of edges that is an asymmetric relation on V;
• r_0 ∈ V, the root of the tree;
and satisfies the following constraints:
• There are no incoming edges for r_0, and there is at most one incoming edge for every other vertex.
• All vertices are reachable from r_0.
If (v_1, v_2) ∈ E, then v_1 is denoted parent(v_2), while v_2 is called a child of v_1. We define the relation reachable_E(v_1, v_2) as the reflexive transitive closure of the relation E.
Definition 2. An attributed tree AT = (G, Key, Value, attrs) consists of:
• a tree G = (V, E, r_0);
• a set of attribute keys Key;
• a set of attribute values Value;
• a functional relation attrs: V → (Key → Value) that provides each vertex with a set of attributes.
The set of all possible attributed trees is denoted ATrees.
An attributed tree is a convenient framework for representing requirements [12], with the following semantics. If a vertex v ∈ V represents a requirement for a target system and v has children v_1, …, v_n, then the children represent a decomposition of the requirement v. In other words, a system satisfies requirement v if and only if it satisfies all of the requirements v_1, …, v_n. The attributes of a vertex contain various information about the requirement, for example a unique identifier, a description of the requirement in natural language, its representation in a formal notation, its version, etc.
An interesting particular case is an attribute whose value is a vertex v ∈ V or a set of vertices vs ⊆ V. Such attributes make it possible to define and manage relations between different vertices. For example, they can be used to represent traceability links between high-level and low-level requirements. Formally, this case is achieved if V ∪ 𝒫(V) ⊆ Value, where 𝒫(V) denotes the power set of V.
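As an informal illustration of Definitions 1 and 2, the following Python sketch shows one possible in-memory representation of an attributed tree. The class name Vertex, the method add_child and the attribute key traces_to are hypothetical and are not taken from the Requality implementation; the sketch only assumes the constraints stated above (a single root, at most one parent per vertex).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(eq=False)  # vertices are compared by identity, not by content
class Vertex:
    """One element of V together with its attrs(v) mapping (Key -> Value)."""
    name: str
    attrs: Dict[str, Any] = field(default_factory=dict)
    parent: Optional["Vertex"] = None              # at most one incoming edge
    children: List["Vertex"] = field(default_factory=list)

    def add_child(self, child: "Vertex") -> "Vertex":
        # Adds the edge (self, child) to E; a vertex may have only one parent.
        if child.parent is not None:
            raise ValueError("vertex already has a parent")
        child.parent = self
        self.children.append(child)
        return child


def reachable(v1: Vertex, v2: Vertex) -> bool:
    """Reflexive transitive closure of E: is v2 reachable from v1?"""
    node: Optional[Vertex] = v2
    while node is not None:          # walk from v2 up towards the root
        if node is v1:
            return True
        node = node.parent
    return False


# A root with two children; the attribute value of 'low' is itself a vertex,
# which corresponds to the case V ⊆ Value (a traceability link).
r0 = Vertex("r0", {"id": "SYS"})
high = r0.add_child(Vertex("high", {"id": "HLR-1"}))
low = r0.add_child(Vertex("low", {"id": "LLR-1", "traces_to": high}))
```

Storing the parent in each vertex makes the "at most one incoming edge" constraint explicit and lets reachable walk upwards instead of searching the whole tree.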

The extension of the base model
The base model of a requirements catalogue is an attributed tree in which each requirement has a particular set of attributes. This model is convenient for analysing the catalogue, e.g. for formal analysis, test coverage analysis, traceability analysis, etc. At the same time, it is difficult to maintain such a model manually, because there are usually many interdependencies between elements and their attributes. Hereafter, the terms vertex (an element of the set V) and element of the requirements catalogue are used interchangeably.
That is why we introduce a declarative model of the requirements catalogue that allows us to automate the handling of such dependencies. The purpose of the declarative model is to store the requirements catalogue in a more compact and manageable way. The declarative model is defined stepwise. Each step is accompanied by a definition of the transformation of the declarative model into the raw base model.

Predicates
If requirements are developed for a product line, a number of requirements are shared between different variations of the product. It is natural to want a single requirements catalogue for the product line together with the ability to build a specific catalogue for a particular version of the product. This means that a subset of requirements must be deleted from the catalogue if the subset is not applicable to the target product. A similar situation arises when a catalogue is used to represent the requirements of several revisions of a standard, or the requirements of a standard with optional elements.
To provide this ability, we introduce a special key predicate ∈ Key whose values are Boolean. If an element has the attribute predicate with the value false, this element and all its children are removed from the catalogue during the transformation. The first declarative model DM_1 is an attributed tree ((V, E, r_0), Key ⊔ {predicate}, Value, pattrs) that is transformed into the base model ((V', E', r_0), Key, Value, attrs) according to the following rules:

Calculated attributes
It is often the case that an attribute value depends on the values of other attributes of the same element, or even on attributes of other elements. To express such dependencies explicitly, we propose the second declarative model DM_2, which is an attributed tree (G, Key, FValue, fattrs), where each element of FValue is a pair (func, fval) of a function func: ATrees × V × Key × Value → Value and a stored value fval ∈ Value. The declarative model DM_1 corresponding to the model DM_2 is an attributed tree AT = (G, Key, Value, attrs) in which attrs(v)(k) = func(AT, v, k, fval) whenever fattrs(v)(k) = (func, fval). To build such a requirements model, it is necessary to solve the set of equations defined by fattrs. A simple approach is to apply fixed-point iteration; additional implementation details are considered in section V. Some declarative models define a set of equations with no solution or with non-unique solutions. A simple but reasonable restriction that excludes such models is the prohibition of cyclic dependencies between attributes. The particular case in which an attribute has a constant value val is represented in the declarative model DM_2 as the pair (prj_4, val), where prj_4 is the projection onto the fourth argument: prj_4(AT, v, k, val) = val. Note that in DM_2 predicate is treated as a regular element of the set Key.
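The fixed-point iteration mentioned above can be illustrated by a minimal Python sketch. It rests on several simplifying assumptions that are not part of the formal model: the attributed tree passed to each function is reduced to the mapping vertex → key → value, a value that is not yet known is represented by None, and each attribute definition is a pair (func, fval) with func taking the arguments (AT, v, k, fval) in the same order as prj_4. The names solve, full_title and max_rounds are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

Attrs = Dict[str, Dict[str, Any]]                  # vertex -> key -> value
Func = Callable[[Attrs, str, str, Any], Any]       # (AT, v, k, fval) -> value
FAttrs = Dict[str, Dict[str, Tuple[Func, Any]]]    # vertex -> key -> (func, fval)


def prj4(at: Attrs, v: str, k: str, val: Any) -> Any:
    """Constant attribute: projection onto the fourth argument."""
    return val


def solve(fattrs: FAttrs, max_rounds: int = 100) -> Attrs:
    """Iterate attrs(v)(k) = func(AT, v, k, fval) until nothing changes."""
    # Initial approximation: every attribute is still unknown (None).
    at: Attrs = {v: {k: None for k in defs} for v, defs in fattrs.items()}
    for _ in range(max_rounds):
        new_at: Attrs = {
            v: {k: func(at, v, k, fval) for k, (func, fval) in defs.items()}
            for v, defs in fattrs.items()
        }
        if new_at == at:              # a fixed point has been reached
            return at
        at = new_at
    raise RuntimeError("no fixed point found within max_rounds")


def full_title(at: Attrs, v: str, k: str, fval: Any) -> Any:
    # Depends on two other attributes of the same element.
    ident, name = at[v]["id"], at[v]["name"]
    if ident is None or name is None:
        return None                   # dependencies not evaluated yet
    return f"{ident}: {name}"


fattrs: FAttrs = {
    "req1": {
        "id": (prj4, "R-1"),
        "name": (prj4, "Boot time"),
        "title": (full_title, None),
    }
}
attrs = solve(fattrs)
```

With acyclic dependencies, each round fixes at least the attributes whose dependencies were fixed in the previous round, so the iteration stabilises after a number of rounds bounded by the length of the longest dependency chain.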

Attribute scope
Another common situation is that an attribute is applicable to a whole subtree and has the same value for all its elements. A similar case is when an attribute is applicable to all children of a particular element. To handle such situations, we propose the third declarative model DM_3, which is an attributed tree (G, Key, SValue, sattrs), where SValue = FValue × Scope and Scope = {S_L, S_DC, S_S}, with the elements having the following semantics:
• S_L: the attribute is available only in the element where it is defined.
• S_DC: the attribute is available in the element where it is defined and in all its direct children.
• S_S: the attribute is available in the element where it is defined and in all its successors.
An example of attribute scopes is shown in Fig. 1. White rectangles are vertices. Arrows denote the child-parent relation. An attribute A with some scope is defined in r_0. Grey rectangles represent the different possible scopes of A and the subtrees in which it is accessible.
The transformation of the declarative model DM_3 to the model DM_2 is straightforward: DM_2 is an attributed tree (G, Key, FValue, fattrs), where fattrs(v) = {k → fval} such that (1) and (2) are not applicable ∧

Fig. 1. Attribute scopes
It is interesting to note that non-constant scoped attributes can take different values in different elements, because their function can depend on the vertex as a third argument.
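The semantics of the three scopes can be sketched in Python as follows. This is an illustration only: attribute values are plain constants rather than (func, fval) pairs, the tree is given as a hypothetical parent map, and the rule that a definition closer to the target vertex shadows a more distant one is an assumption of the sketch, not a statement of the formal transformation.

```python
from enum import Enum
from typing import Any, Dict, Optional, Tuple


class Scope(Enum):
    LOCAL = "S_L"              # only the defining element
    DIRECT_CHILDREN = "S_DC"   # the defining element and its direct children
    SUBTREE = "S_S"            # the defining element and all its successors


def effective_attrs(
    v: str,
    parent: Dict[str, Optional[str]],
    sattrs: Dict[str, Dict[str, Tuple[Any, Scope]]],
) -> Dict[str, Any]:
    """Collect the attribute values that are visible in vertex v."""
    result: Dict[str, Any] = {}
    node: Optional[str] = v
    distance = 0                       # number of edges between node and v
    while node is not None:
        for key, (value, scope) in sattrs.get(node, {}).items():
            visible = (
                distance == 0
                or scope is Scope.SUBTREE
                or (scope is Scope.DIRECT_CHILDREN and distance == 1)
            )
            # Assumption: a definition closer to v shadows a more distant one.
            if visible and key not in result:
                result[key] = value
        node = parent[node]
        distance += 1
    return result


# r0 -> a -> b, with three differently scoped attributes defined in r0.
parent = {"r0": None, "a": "r0", "b": "a"}
sattrs = {
    "r0": {
        "standard": ("STD-A", Scope.SUBTREE),
        "owner": ("alice", Scope.DIRECT_CHILDREN),
        "note": ("root only", Scope.LOCAL),
    },
    "b": {"standard": ("custom", Scope.LOCAL)},
}
```

In this example the vertex a sees standard and owner, while b sees only its own local definition of standard, because owner does not reach grandchildren and note never leaves r0.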

Reuse of subtrees
The next item to consider is the situation in which several subtrees of requirements are similar to each other up to a limited number of details. In this case, it would be ideal to have a single copy of the subtree and the ability to clone it with some modifications. This approach is usually called reuse [13]. The fourth declarative model DM_4 is an attributed tree ((V, E, r_0), Key ⊔ {cp}, SValue, cpattrs) with a special key cp that satisfies the following constraints:
Please note that newDM_4 satisfies both constraints of the fourth declarative model.
7. curDM_4 := newDM_4 and go to step 2.
Lemma 1. The algorithm terminates for any DM_4 satisfying the constraints.
The proof is based on the fact that the cardinality of CC(curDM_4) decreases at every iteration: the choice of v_0 at step 4 guarantees that elements with the attribute cp are not cloned, while one such element loses that attribute.
Lemma 2. The result of the transformation does not depend on the order in which elements are selected at step 4.
The idea of the proof is that the transformations that can be chosen in a non-deterministic order make modifications in non-intersecting subtrees.
It is interesting to note that a combination of the reuse and predicate transformations can be used to define a generic subtree that is instantiated several times with different arguments by the reuse transformation, after which the original generic subtree is eliminated by the predicate transformation. The predicate transformation can also be useful for eliminating unneeded elements from the cloned subtrees.

Identification
One of the important aspects of requirements management is requirements identification. A common approach is to assign a unique identifier, for example a number or a string, to each object. In addition, each element can be given a qualified identifier QID, defined recursively on top of identifiers ID that are unique among the children of the same parent: r_0 has QID = '/ID', and a child v has QID = 'QID(parent)/ID'. With QIDs, every requirement has a human-readable path. For example, we may have an element with QID = "Functional requirements/Ports/req001". As the path shows, its parent is "Functional requirements/Ports/" and its ID is req001.
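The recursive definition of QID translates directly into code. The Python sketch below assumes that the tree is given as a parent map and that local identifiers are stored in a separate map; both maps, the function name qid and the root identifier "Catalogue" are hypothetical.

```python
from typing import Dict, Optional


def qid(v: str, parent: Dict[str, Optional[str]], ident: Dict[str, str]) -> str:
    """r_0 has QID '/ID'; a child v has QID 'QID(parent)/ID'."""
    p = parent[v]
    prefix = "" if p is None else qid(p, parent, ident)
    return f"{prefix}/{ident[v]}"


parent = {"root": None, "func": "root", "ports": "func", "r1": "ports"}
ident = {
    "root": "Catalogue",
    "func": "Functional requirements",
    "ports": "Ports",
    "r1": "req001",
}
path = qid("r1", parent, ident)
```

Because identifiers are unique only among siblings, the resulting path is unique in the whole tree even when the same local ID, such as req001, is reused under different parents.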

Calculated attributes
There are two objects related to attributes in the implementation. The first one, the attribute definition A_DEF, represents a pair (func, fval) from the formal declarative model, where func is of type ATrees × V × Key × Value → Value. The second one, the attribute A, represents the value of the attribute in the base model. A_DEF is used to calculate the actual value A when it is required.
Several kinds of functions are supported in attribute definitions. The first kind is the constant function prj_4, which always returns the fval value stored in the attribute definition. The second kind is template functions, which store in fval a string with parameters encoded in curly brackets, e.g. "Hello, {K}". The value substituted for a parameter is taken from the attribute of the same element with the encoded name, 'K' in the example above. The third kind is the formula value generator, which stores in fval a string with an expression in a subset of the JavaScript language that has access to the attributes of the same element. The fourth kind is virtual attributes, which are implemented in Java. They have no stored fval value at all, but they have access to the whole context of the element, including the complete attributed tree. For example, the Label attribute can take the value of the user-defined Name attribute if there is one, and return the system-defined identifier otherwise. Another example is QID, which calculates the qualified identifier of an element by concatenating the parent's QID and the Name of the target element with '/'.
An important additional piece of information that the tool is able to extract from an attribute definition is the set of attribute keys whose values are required by the corresponding function to calculate the actual value of the given attribute.
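To make the template kind of function more concrete, the following Python sketch substitutes parameters in curly brackets and extracts the set of keys that a template depends on. It is not the implementation used in the tool, which is written in Java; the function names are hypothetical, parameter names are assumed to consist of word characters only, and leaving an unknown parameter untouched is an assumption of the sketch.

```python
import re
from typing import Dict, Set

# A parameter is a name made of word characters enclosed in curly brackets.
_PARAM = re.compile(r"\{(\w+)\}")


def render_template(fval: str, attrs: Dict[str, str]) -> str:
    """Replace each {K} in fval with the value of attribute K of the same element."""
    def substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        # Assumption: an unknown key is left in place instead of raising an error.
        return str(attrs.get(key, match.group(0)))

    return _PARAM.sub(substitute, fval)


def template_dependencies(fval: str) -> Set[str]:
    """Keys whose values are needed to evaluate the template."""
    return set(_PARAM.findall(fval))


greeting = render_template("Hello, {K}", {"K": "world"})
```

The second function corresponds to the dependency information that is extracted from an attribute definition and later used to order the evaluation of attributes.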

Attributes life-cycle
For each attribute, the stored data includes the function kind and fval. The pair (funckind, fval) is denoted A_ST. System-defined virtual attributes have no stored data; they are added to elements on the fly. The common process of loading the attributes of a requirement is as follows.
1. The set of A_ST is loaded from storage into A_DEFS.
2. The set of scoped attributes applicable to the target element is taken from its parent and added to A_DEFS.
3. The A_DEFS set is handled by the Attribute_Calculation procedure described below.
If attributes are changed by the user in a GUI session, the tool has the same kind of A_DEFS set, which contains the subset of changed attribute definitions. The tool then applies the same Attribute_Calculation procedure, as follows.
1. The A_DEFS set is extended with the attributes of the target element that depend on any attribute already belonging to A_DEFS.
2. The order of evaluation of the attributes from A_DEFS is calculated. The order can be defined as ORDER = (K_1 .. K_n), where K_i is the key of an attribute and ∀ K_i, K_j ∈ ORDER: if K_j depends on K_i then i < j. The algorithm is described in the next section.
3. For each A_DEF in A_DEFS, the value of A is calculated and placed in AS.
After this procedure, AS contains the actual state of the attributes after the changes.

Order extraction algorithm
The input of order extraction is KEYS = (K_1 .. K_n), the set of attribute names in arbitrary order, and DEPS = (K_i → (K_j1 .. K_jm)), a map of attribute dependencies. The algorithm is as follows:
1. ORDER is set to an empty collection.
2. OSET is the set of nodes still to be handled.
3. Extract the reverse dependencies DEPS_K = (K_j → (K_i1 .. K_il)). If K_i depends on K_j then DEPS_K contains a K_i → K_j record.
4. Place KEYS in OSET.
5. Set the flag MOD to False.
6. In OSET, look for a candidate KK with DEPS_K[KK] = KSET that satisfies one of the following rules:
• KSET is empty; or
• ∄ K_i ∈ KSET: K_i ∈ OSET.
7. If KK was found then:
   1. MOD is set to True.
   2. KK is removed from OSET.
   3. KK is added to ORDER.
8. If MOD = True and |OSET| ≠ 0 then go to step 5.
At the end of execution, ORDER contains the order in which the values of A are to be calculated.
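The steps above amount to a topological ordering of the attribute keys. A minimal Python sketch under the following assumptions is given below: the map deps lists, for each key, the keys it depends on, and a key is emitted as soon as none of its dependencies is still pending. Raising an error for cyclic dependencies is an addition of the sketch; the algorithm as described simply stops. The function name extract_order is hypothetical.

```python
from typing import Dict, Iterable, List, Set


def extract_order(keys: Iterable[str], deps: Dict[str, Set[str]]) -> List[str]:
    """Return the keys so that every key comes after the keys it depends on."""
    order: List[str] = []                  # ORDER
    pending: List[str] = list(keys)        # OSET
    modified = True                        # MOD
    while modified and pending:
        modified = False
        for key in pending:
            # A key is ready when none of its dependencies is still pending.
            if not any(d in pending for d in deps.get(key, ())):
                pending.remove(key)
                order.append(key)
                modified = True
                break                      # restart the scan of the pending keys
    if pending:
        # Addition of this sketch: report the keys involved in a cycle.
        raise ValueError(f"cyclic dependencies among: {sorted(pending)}")
    return order


order = extract_order(
    ["title", "qid", "name"],
    {"title": {"name", "qid"}, "qid": {"name"}},
)
```

The scan restarts after each extracted key, which gives quadratic behaviour in the number of keys; this is usually acceptable because the number of attributes per element is small.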

Attribute change management
The introduction of scopes and calculated attributes requires the management of attribute changes in order to keep all dependent attributes up to date. There are two possible strategies for dealing with attribute updates. The first is to commit all changes immediately at runtime. The second is to collect changes in AS and then apply them all on request. Immediate commit tends to be simpler but more computationally intensive. Late updates require fewer calculations but need more memory. In our tool, we use the second approach, because we have large catalogues with potentially complex relationships between their elements.
Late changes can be defined in the form of a new object, the change set CS = (K → OP, K → A_OLD, K → A_NEW), where K is the key of an attribute, OP ∈ {remove, create, modify} is the operation on the attribute, A_OLD is the value of the attribute in AS before the operation, and A_NEW is the new attribute value after the operation. For attribute changes, the change set needs to store A_DEFS, so the minimal change set is CS = (K, K → A_OLD, K → A_NEW). To use such change sets, we need to extend the model of the attribute set of A. When all attribute modifications have been collected, we need to apply all the changes to calculate the actual values of the attributes. This is implemented in the same way as described in section V.C.
One more problem with attribute changes is that some of the changes need to be propagated from one requirement to another. To deal with this problem, we define the concept of change propagators. If an A_DEF (virtual attributes only, for now) depends on attributes of an external element, it registers a function, the change propagator, that is called when a change set is applied to the attributes of that element. The change propagator evaluates whether the changes affect the target attribute and initiates its recalculation if required.
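The deferred strategy can be illustrated by a small Python sketch of a change set that records the operation, the old value and the new value for each key and touches the stored attributes only when apply is called. The class and method names are hypothetical, attribute values are plain constants, and recalculation of dependent attributes and change propagators are deliberately left out.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Change:
    op: str                   # "create", "modify" or "remove"
    old: Optional[Any]        # A_OLD: value before the operation
    new: Optional[Any]        # A_NEW: value after the operation


class ChangeSet:
    """Collects attribute changes and applies them all on request."""

    def __init__(self, attrs: Dict[str, Any]) -> None:
        self._attrs = attrs
        self.pending: Dict[str, Change] = {}

    def set(self, key: str, value: Any) -> None:
        op = "modify" if key in self._attrs else "create"
        self.pending[key] = Change(op, self._attrs.get(key), value)

    def remove(self, key: str) -> None:
        self.pending[key] = Change("remove", self._attrs.get(key), None)

    def apply(self) -> None:
        # Only at this point are the stored attributes actually modified.
        for key, change in self.pending.items():
            if change.op == "remove":
                self._attrs.pop(key, None)
            else:
                self._attrs[key] = change.new
        self.pending.clear()


attrs = {"name": "Boot time", "draft": True}
cs = ChangeSet(attrs)
cs.set("name", "Boot time limit")
cs.set("owner", "alice")
cs.remove("draft")
snapshot_before_apply = dict(attrs)
cs.apply()
```

Keeping the old value next to the new one is what allows a propagator, or an undo operation, to decide whether a change is relevant without consulting the stored state again.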

Lazy loading
When we speak about a model of requirements in a typical application area such as avionics, we need to take into account the number of distinct requirements. The number of artefacts in such models can reach thousands or tens of thousands. In that case, direct management of requirements may require a lot of resources.
To solve this problem, we use the lazy loading principle. This means that AT contains only those vertices that are requested while the model is being used. In most cases, this means that within G we have a subtree G_L ⊆ G that contains r_0 and the subtrees used during the current working session. However, the laziness of the model leads to some difficulties. First of all, we need to search the whole AT rather than only its loaded part AT_L if we need to be sure that a vertex with a given ID exists. This problem can be solved by caching ID-related information in a CacheStore that is always available.

Attribute types
In practice, the value of an attribute may have one more property: a type. One possible set of types includes Integer, Boolean, String and Float. We may also define types for Collection and Enumeration. In most cases, the value is still a simple constant. However, some attribute types cannot be defined as a single value and need to store and manage additional data. For example, the Collection type may use a specific object LIST = (T_V, V_1 .. V_n), where T_V is the type of the collection's values and (V_1 .. V_n) are the values stored in the collection. One more specific type is Enumeration. First, an enumeration requires a definition of its values. This can be done by means of ENUM_DEF = (V_T, V_1 .. V_n), which is similar to LIST. To define an attribute with one selected enumeration value, however, we need one more object ENUM = (K_B, V_S), where K_B is the key of the attribute A holding the ENUM_DEF and V_S is the selected value. If we introduce an ENUM, we need to ensure that for every ENUM there is an attribute A_D whose type T is ENUM_DEF and whose values V_D contain V_S.

References
One more problem is the implementation of relations between elements of the catalogue. Some tools manage them as specific objects placed in a separate set. References also require some additional handling to keep them consistent. If a REF or its target V is changed, we may need to track the change and update the related REF_VALUE.
One more specific problem is reverse links. If we have a relation V_1 → V_2, we may need to know for V_2 that there is a relation to it from V_1. Relations of this kind are called "reverse references". If links are stored in AT, then we may use one more function (V_2, LN) → V_1 to store reverse relations. If we define a new type of attribute or a specific state of REF_VALUE, then we face the problem of keeping it up to date. In our model, we store reverse links in the cache in the form of a function (V_2, LN) → (V_1 … V_n). This allows us to obtain the reverse links of V_2 easily, provided that the state of the cache is valid.
With a completely loaded AT, the problem is not so difficult to solve, because we always have the actual state of every V. But we cannot guarantee the state of V for a partially loaded AT, which is what lazy loading produces. If we have a loaded part AT_L ⊆ AT and a relation (V_1, LN) → V_2 with V_1 ∉ AT_L ∧ V_2 ∈ AT_L, then in order to obtain the reverse links of V_2 we may need to load the whole AT to be sure that all possible V_1 have been found. In our case, this problem is solved by storing reverse links in the cache. Even so, one problem remains. Let us introduce a link L(V_1, V_2, LN). If we have already resolved this link, then the record is expected to be present in the cache. But what if V_2 is introduced into the model after V_1 has been loaded and the link was found to be unresolved? This situation occurs when V_2 is lazily loaded, created or modified. In the worst case, we need to track changes of the whole AT for all links. A better solution is to manage some kind of scope within which a link is expected to be resolved. That is not implemented yet, but it is in our plans.
Relations can be used for some specific activities. One of them is change management. Change management is performed when some V_1 with links (L_1 .. L_n) is changed. In this case, some operations are performed on the vertices obtained from L_1 .. L_n. The nature of such operations can vary. In some tools, those vertices are marked in the model with a specific flag. In other cases, the model can define additional actions depending on the kind of change.
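The cached function (V_2, LN) → (V_1 … V_n) discussed in this section can be sketched in Python as a dictionary keyed by the target and the link name. The class ReverseLinkCache, its methods and the link name "refines" are hypothetical; the sketch ignores persistence and the invalidation problems related to lazy loading.

```python
from collections import defaultdict
from typing import DefaultDict, Set, Tuple


class ReverseLinkCache:
    """Maps (target id, link name) to the ids of all elements that refer to the target."""

    def __init__(self) -> None:
        self._sources: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add_link(self, v1: str, v2: str, ln: str) -> None:
        # Record the forward link v1 --ln--> v2 under the key of its target.
        self._sources[(v2, ln)].add(v1)

    def remove_link(self, v1: str, v2: str, ln: str) -> None:
        self._sources[(v2, ln)].discard(v1)

    def reverse(self, v2: str, ln: str) -> Set[str]:
        # Return a copy so that callers cannot corrupt the cache.
        return set(self._sources.get((v2, ln), ()))


cache = ReverseLinkCache()
cache.add_link("LLR-1", "HLR-1", "refines")
cache.add_link("LLR-2", "HLR-1", "refines")
sources = cache.reverse("HLR-1", "refines")
```

Because the cache is keyed by identifiers rather than by loaded vertices, a lookup for V_2 does not require V_1 to be present in the loaded part of the tree.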

Conclusion
We have presented a formal metamodel that is used as the basis of the Requality requirements management tool, and we have discussed various difficulties related to its implementation. Experience demonstrates that the model can handle quite large requirements catalogues with many relations between their elements. Future work includes the analysis and implementation of new kinds of functions for calculated values and the development of user-friendly patterns for solving common user tasks on top of the semantics defined in the paper. Another direction is research into possible compositions of the formal model provided by the tool with the formal models used to represent particular requirements.