Specification-Based Test Program Generation for MIPS64 Memory Management Units

. In this paper, a tool for automatically generating test programs for MIPS64 memory management units is described. The solution is based on the MicroTESK framework being developed at the Institute for System Programming of the Russian Academy of Sciences. The tool consists of two parts: an architecture-independent test program generation core and MIPS64 memory subsystem specifications. Such separation is not a new principle in the area: it is applied in a number of industrial test program generators, including IBM’s Genesys-Pro. The main distinction is in how specifications are represented, what sort of information is extracted from them, and how that information is exploited. In the suggested approach, specifications comprise descriptions of the memory access instructions, loads and stores, and definition of the memory management mechanisms such as translation lookaside buffers, page tables, table lookup units, and caches. A dedicated problem-oriented language, called MMU SL, is used for the task. The tool analyzes the MMU SL specifications and extracts all possible instruction execution paths as well as all possible inter-path dependencies. The extracted information is used to systematically enumerate test programs for a given user-defined test template and allows exhaustively exercising co-execution of the template instructions, including corner cases. Test data for a particular program are generated by using symbolic execution and constraint solving techniques.


Introduction
A computer memory is known to be a complex hierarchy of data storage devices varying in volume, latency and price.In addition to registers and main memory, Kamkin A.S,, Kotsynyak A.M. Specification-Based Test Program Generation for MIPS64 Memory Management Units.Trudy ISP RAN /Proc.ISP RAS, vol. 28, issue 4, 2016, pp. 99-114.100 microprocessors include a multi-level cache memory and address translation buffers.The set of devices responsible for handling memory accesses is referred to as a memory subsystem or a memory management unit (MMU).Being one of the key microprocessor components, the memory subsystem is strongly required to be correct and reliable.Due to the complicated structure of the memory, the number of situations that can occur in processing load and store instructions is huge; this makes it improbable to verify the subsystem «manually».It is widely accepted that test program generation (TPG) is an essential approach to microprocessor verification [3].The problem is how to overcome the complexity and at the same time provide acceptable test coverage.It is a fallacy that (naive) random TPG is a good way to optimize testing [4].A better solution, we think, is a specification-based approach [3].A TPG tool consists of two components: (1) an architecture-independent test generation core and (2) an architecture specification, or model.The approach reduces the efforts to create a generator by reusing the corethe only thing one needs to develop is a specification.There exist a number of tools implementing the paradigm mentioned above [3], [5], [6].However, only few of them are distributed under open licenses.ISP RAS's MicroTESK [7] is one of those few.The tool uses a dialect of the nML language [8] for specifying instruction set architectures (ISA) and an extensible set of dedicated languages for specifying particular microarchitectural features, including, first of all, a memory management.In this work, we would like to share our experience in creating a MicroTESK-based TPG for verifying MIPS64 MMUs [1], [2].The remainder of this paper is divided into four sections.Section 2 contains a brief survey of the existing approaches to TPG for MMUs.Section 3 presents the MicroTESK framework and its facilities aimed at MMU specification and testing.Section 4 studies application of the TPG approach for MIPS64 MMU.Section 5 discusses the results of the work and outlines directions of future research and development.

Related work
There are several TPG tools based on formal specifications of memory subsystems.IBM's DeepTrans [9] uses a dedicated specification language.Address translation is depicted as a directed acyclic graph (DAG) whose vertices correspond to the process stages and whose edges relate to the transitions between the stages.A path from the source of the DAG to the sink defines a particular situation in the address translation.Such situations can be referred from high-level descriptions of test programs (TPs), so-called test templates (TTs).The latter are processed by Genesys-Pro [3], which formulates constraints on instruction operands, solves them and transforms the solutions into the instruction sequences.The major advantage of the approach is the use of the highly developed languages for modeling MMUs and describing TTs.A possible disadvantage is that the tool seems not to be able to automatically extract MMU-related dependencies between instructions.
In [10], the Java language coupled with a special library is used to model MMUs.As in DeepTrans, the situations correspond to the paths in the DAG describing the MMU.For example, {TLB(va).hit,TLB(va).entry.V, L1(pa).hit}:there is a hit in the translation lookaside buffer (TLB); the matched entry is valid; there occurs a miss in the first-level cache (L1).In addition, the approach provides facilities for specifying MMU-related dependencies between instructions.For example, {TLB ↦ tagEqual, L1 ↦ indexEqual}: instructions access different TLB entries; data are mapped onto the same set of L1.TTs are constructed automatically by combining situations and dependencies for short sequences of instructions.Building TTs and creating TPs is done by MicroTESK (version 1) [7].The strength of the approach is systematic TT enumeration that takes into consideration instruction execution paths as well as dependencies between instructions.The principal weakness is underdeveloped specification facilities.

MicroTESK Framework
MicroTESK (version 2.3 or higher) [11] combines the advantages of the approaches presented in [9] and [10].The tool inputs are ISA specifications in nML [8], MMU specifications in MMUSL (MMU Specification Language) and TTs in Ruby [12].The basic principles of MicroTESK are close to ones implemented in Genesys-Pro [3].The specifications are analyzed to extract testing knowledge (situations and dependencies), which is used to generate TPs from the given TTs as well as to systematically enumerate TTs.More information on the tool can be found in [13] and [14].Here we provide a brief introduction to MicroTESK by the example of an MIPS64 MMU [2].

ISA Specifications
ISA specifications include definitions of data types, constants, registers, access modes, memories and instructions.Here comes an example (a fragment of the MIPS64 specification), where there are listed three data types, BYTE, SHORT and DWORD.

type BYTE = card(8)
// unsigned 8-bit vectors type SHORT = int(16) // signed 16-bit vectors type DWORD = card(64) // unsigned 64-bit vectors Registers of the same type are grouped into arrays.Register access logic is encapsulated in so-called modes, which, besides other things, define assembly format (syntax) and binary encoding (image) of the registers.The following example declares an array GPR, consisting of thirty two 64-bit registers, designates a stack pointer alias SP = GPR [29], and defines a mode REG aimed at accessing those registers.
reg GPR [32,DWORD] // Array of 32 DWORD Registers reg SP[DWORD] alias = GPR[29] // Stack Pointer Alias mode REG (i: card(5)) = GPR[i] // One-to-One Mapping syntax = format("r%d", i) // Assembly Format image = format("%5s", i) // Binary Encoding number = i // Custom Attribute Like a group of registers, a memory unit is represented as a plain array.In the example below, an array MEM is interpreted as a physical memory comprised of 2 bytes.
Virtual memory issues such as address translation, caching, and the like are specified separately with the use of a dedicated language (see the next section).
mem MEM[2 ** 36, BYTE] // Physical Memory Array The attributes of instructions include syntax, image and action.Actions of load and store instructions are described in an intuitive manner by reading or writing data from or to the array representing the physical memory.Here is a specification of the Load Byte instruction (LB), which derives an address from a base register (base) with given offset (offset), loads a byte from the memory, and writes it to a register (rt).Notwithstanding MEM is interpreted as the physical memory, it is accessed through virtual addresses -an access triggers the address translation mechanisms and other MMU logic.

MMU Specifications
Being rather simple, nML does not have adequate facilities to describe MMUs.For this purpose, a special MMUSL language is used.MMU specifications include address types, memory segments, buffers, and control logic for handling loads and stores.In the following example, address type, VA, is declared.It is a structure with single field -address itself.
address VA(vaddress : 64) A memory segment is considered as a mapping from a set of addresses of some type to a set of addresses of another type.An example given below defines a segment XKPHYS that maps a VA of the given set (range) to the physical address (PA).The segment performs flat translation with no use of TLBs and tables (read).
segment XKPHYS (va: VA) = (pa: PA) range = (0x8000000000000000, 0xbfffFFFFffffFFFF) read = { pa.paddress = va.vaddress<35..0>; pa.cca = va.vaddress<61..59>; } Buffers (TLBs, cache units, page tables, etc.) are specified with the following parameters: the associativity (ways), the number of sets (sets), the entry format (entry), the index calculation function (index), the tag calculation function (tag) and the data eviction policy (policy).Their meaning passes current among microprocessors designers.Here comes a sample description of TLB.It is accessed by VAs.The keyword register means that the buffer is mapped to the registers and can be accessed from the ISA specifications.Processing of memory access instructions is specified by requesting the segments and buffers.The syntax is similar to nML though allows using such constructs as B(A).hit(the buffer B contains an entry for the address A), E=B(A) (the entry for the address A is read from the buffer B and assigned to E), B(A)=E (the entry E for the address A is written to the buffer B), and the like.

TPG Approach
The MicroTESK TPG approach is based on TTs written in Ruby [12].In general terms, the process is as follows [14].A TT describing a microprocessor verification scenario is given to MicroTESK.The tool processes the TT and builds a series of symbolic TPs, where abstract situations and dependencies (often in the form of constraints) are used instead of specific values.Each symbolic TP is instantiated with appropriate test data (TD).The resultant TP is supplemented with preparation code that initializes the registers, the buffers, and the memory.TTs are allowed to use modes and instructions defined in the specifications as well as special TPG constructs (blocks, situations, etc) [14].More technically, a TT is a subclass of the Template base class provided by the MicroTESK library.In the example below, MmuTemplate is a subclass of Mips64BaseTemplate, which, in turn, is a subclass of Template.The entry point is method run.This method declares a block of two instructions, LD and SD, to be processed with the dedicated memory engine.The situation access guides TPG by specifying constraints and biases for the MMU variables and buffers.The denotation reg(_) means any instance of the mode REG, i.e. any GPR.
class MmuTemplate < Mips64BaseTemplate def run block(:engine => "memory", ...) { ld reg(_), 0x0, reg(_) do situation("access", hit("L1"), ...) end sd reg(_), 0x0, reg(_) } end end Let us consider how TPG for MMUs is organized.Parsing specifications results in two entities: an interpreter, which is a part of the instruction set simulator (ISS), and a symbolic representation in the form of a labeled DAG.The DAG is traversed, and all possible execution paths are extracted.An execution path describes processing of a single memory request and finishes either with a memory access or with an exception (alignment fault, TLB refill event, etc.).Paths are composed of transitions.Each transition is supplied with a guard, i.e. a condition that enables the transition, and an action to be performed; it can also be labeled with a buffer being used in the guarded action.Here is a fragment of the execution path in MMU (see above) represented in a hypothetical language.
path PATH(va: VA) = (data: DATA_SIZE) transition { guard = TRUE action = {} // Go to TranslateAddress(va) } … transition { guard = L1(pa).hitaction = { l1Entry = L1(pa); line = l1Entry.DATA; } buffer = L1 } … Given two execution paths, the tool can extract possible dependencies between them.A dependency is a map from the set of buffers common for the given paths to the set of conflict types.More formally, let and be execution paths, be a non-empty set of conflict types, and be the set of buffers used in a path .A dependency between and is a map : ∩ → .The set is supposed to include the following elements and their negations:  indexEqual -access to the same set of the buffer;  tagEqual -access to the same entry of the buffer;  tagEvicted -access to the recently evicted entry.
Given a TT, symbolic TPs are systematically enumerated.The main, but not the only, approach supported by MicroTESK is combinatorial generation.Symbolic TPs are constructed by selecting all relevant execution paths for the TT's instructions and producing all satisfiable dependencies for each combination of the paths.To avoid combinatorial explosion, special heuristics are used, including factorization of the 106 paths and limitation of the depth of the dependencies.Among them, a buffer-event factorization is frequently used.Let be a path, and ∶ → , be the induced map of the buffers to the events.Two paths, and , are equivalent, if and for each ∈ , holds.During TPG, the equivalence classes are enumerated, while their representatives are randomized.
Symbolic TP is a pair 〈 , , 〉, where is an execution path, and is a dependency between and .To produce a TP from a symbolic TP, appropriate TD are required, including addresses of the instructions, entries of the buffers being accessed (except replaceable ones, such as caches) and sequences of addresses to be used to load or evict data to or from the replaceable buffers.Formally, TD are a tuple 〈 , , , 〉, where is an address of the type used in the path , is an entry of the buffer accessed by the path , , is a sequence of addresses to load data to the set of the buffer and, finally, , is a sequence of addresses to evict data from the set of the buffer .
Here is an approximation of the TD generation algorithm implemented in MicroTESK's memory engine.The following denotations are used: : , is the minimal , such that 1 and , or a special value ∉ if there are no such ; is equivalent to , where is the address type of the buffer ; and are, respectively, the tag and the index extracted from by using the corresponding functions of the buffer ; , , . . . is an address constructed from , , and, probably, some other information; , is an empty entry of the buffer with specified and ; given a buffer , its state , and , , is a tag to be evicted.Other functions will be briefly explained further below.exploits several auxiliary components: , , and .performs symbolic execution of a given path and constructs required entities (addresses, entries, etc.) by calling constraint solvers.Interface with solvers is provided by Fortress library [15].It supports SMT solvers, such as Z3 [16] and CVC4 [17], as well as in-house solvers aimed at particular tasks.Allocator Kamkin A.S,, Kotsynyak A.M. Specification-Based Test Program Generation for MIPS64 Memory Management Units.Trudy ISP RAN /Proc.ISP RAS, vol. 28, issue 4, 2016, pp. 99-114.108 chooses buffer indices, tags and other address fields taking into account user-defined constraints (e.g., forbidden memory regions).The default strategy is to allocate a new index or a new tag for a given index on every request.This allows avoiding undesirable dependencies between instructions.
simulates accesses to buffers and predicts data evictions.The results of the predictions are used to satisfy conflicts.prepares a sequence of accesses so as to fulfill and requirements.The default strategy is as follows.Buffers are handled in reverse order; for every buffer and every set , , and , are added to the sequence.Finally, TD are transformed to the ISA-specific preparation code.For this job, the tool needs to know what instructions have to be used to set up addresses and entries.Such information is provided in TTs in the form of so-called preparators.Technically, a preparator is a piece of code that defines a sequence of instructions to reach a certain goal.Given a register type (to be more precise, an access mode), there usually exists a family of preparators differing in patterns of loaded values.For example, Mips64BaseTemplate contains the following preparator for loading a 32-bit value into a GPR via the mode REG.

MIPS64 MMU Case Study
The most challenging part of developing a specification-based TPG tool for a microprocessor is MMU specification.Speaking of MIPS64, the following things have been specified [2]: address spaces, a TLB entry format, and an address translation procedure.Additionally, we have described a two-level write-through cache memory.MicroTESK's MMUSL has allowed specifying MIPS64 MMU in quite a compact way (approximately 220 lines of code).The specifications involve a TLB (JTLB and DTLB), two-level cache memory buffers (L1 and L2) and memory segments (kseg0, kseg1, xkphys, and useg).On the base of the ISA [18] and MMU [2] specifications, 18 memory access instructions and several auxiliary instructions to access the TLB and the cache have been defined.Description of a single instruction makes up approximately 10 lines of nML code on average.

Number of Execution Paths of an Instruction 76
Table 1 contains numeric data on MIPS64 MMU execution path complexity.While the complexity is relatively low (average path consists of less than 40 transitions and comprises less than 50 variables), only very short TTs can be processed by exhaustive enumeration of symbolic TPs.In more complicated cases, heuristics become of crucial importance.E.g., the buffer-event factorization gives only 9 path equivalence classes, enabling systematic enumeration of longer sequences of memory accesses.Generation of more complicated TPs is done with the help of constrained random generation.This requires verification engineers to explicate their knowledge in the form of constraints and biases.This is an ongoing project, and some useful information, such as test coverage, is not available at the moment.Though it is worth considering the lessons learned.We found it convenient to use domain-specific languages (DSLs) for specifying ISAs and MMUs.The use of DSLs, first, eases extraction of testing knowledge and, second, simplifies learning of the TPG tool.On the other hand, it seems that dynamic programming languages, such as Ruby and Python, suit well for describing TTs.Such languages can be easily extended with TPG constructs.Our negative experience is mostly connected with low performance of the tool.Constraint solving needs to be optimized.As the authors of [19], we believe that specialized solvers will help.The work is still in progress, and a number of things need to be done.The most priority task is a performance optimization of the constraint solving.Another task is to extend the approach to multicore designs and multiprocessor systems.The main challenge here is to create a unified technology that would include formal verification of cache coherence protocols, unit-level verification of MMUs, and system-level TPG.
Here is a fragment of the MIPS64 MMU specification.It contains two attributes, read and write, which, respectively, define logic of loads and stores.

Table 1 .
Complexity of MIPS64 MMU Specification [10]in A.S,, Kotsynyak A.M. Specification-Based Test Program Generation for MIPS64 Memory Management Units.Trudy ISP RAN /Proc.ISPRAS, vol.28, issue 4, 2016, pp.99-114.TPG is a widely-accepted approach to microprocessor verification, including, in particular, MMU verification.State-of-the-art MMUs are extremely complex devices comprising multi-level address translation and caching.Naive approaches to automated TPG for MMUs -meaning, first of all, random generation techniquesare highly improbable to reach high level of test coverage in reasonable time.Specification-based TPG, in our opinion, is one of the most promising directions in the area.Since 1990s, it has been successfully applied to microprocessor testing and verification, e.g., in IBM[3], and it continues to evolve.The MicroTESK team[7]contributes its mite to the evolution of the specificationbased approach.Our goal is to create an open-source, extensible and reconfigurable TPG framework [13],[14].Different versions of MicroTESK, including the one described in[10], have been applied to several industrial microprocessors and allowed to reveal a large number of critical bugs, which had not been detected by randomly generated TPs.The proposed solution is based on ISA specifications in nML[8]and MMU specifications in MMUSL.ISA specifications formally describe microprocessor instructions, while MMU specifications define memory segments and buffers.MicroTESK is able to automatically extract testing knowledge from the specifications and to exploit it for TPG.TTs are created with the help of Ruby[12].To generate TD, symbolic execution and constraint solving techniques are intensively used.