Applying the methods of system analysis to teaching assistants’ evaluation

This article presents the results of applying various methods of system analysis (CATWOE, Rich Picture, AHP, Fuzzy AHP) to the evaluation of teaching assistants. Both soft and hard methods were applied. The methods are illustrated using the "Teaching assistant" program at the Higher School of Economics (HSE). The article shows the interaction of teaching assistants with students and faculty in the form of a Rich Picture. Criteria for the evaluation of teaching assistants are selected and analyzed. Three groups of criteria were defined: professional skills, communication skills, and personal qualities. Each group has several sub-criteria, which were defined by brainstorming. A dedicated screening method was devised that makes it possible to exclude some assistants immediately. In addition, the application of AHP and type-2 Fuzzy AHP to the evaluation of teaching assistants is considered. The strengths and weaknesses of each method are revealed. It is also shown that, despite the power of the methods of system analysis, common sense and logic remain necessary: one should not rely only on the numbers the methods produce. In the course of the work, the best teaching assistant is selected and the group of the best teaching assistants is defined.


Introduction
At the Higher School of Economics (HSE) there is a "Teaching assistant" program, which has been in effect for several years. Each teacher can invite a teaching assistant, who takes over some of the routine tasks related to teaching the course (checking homework, developing test materials, etc.). Any HSE student or graduate student who meets the criteria established by the faculty can be a teaching assistant. The teacher (or group of teachers) formulates tasks for the teaching assistants and monitors the quality of their performance. The teacher is responsible for the students' learning outcomes, the quality of the materials prepared by the teaching assistant, and the methodological support of the teaching assistant's work. At the moment, all faculties establish their own criteria for selecting teaching assistants independently. There is only one criterion common to all disciplines: "A student must have a mark at least 8 on the course in which he/she is involved, or he/she must have a recommendation from the department, to which teaching of this discipline is fixed." However, practice shows that this criterion alone is not enough. No special studies have been devoted to this question, but annual evidence shows that an excellent mark does not fully correlate with being a good teaching assistant. The most recent year revealed that, according to teachers, 60% of assistants were not able to cope with their work. Most problems were connected with personal qualities and with professional and communication skills. For example, some assistants did their tasks slowly and missed deadlines, or simply did not have enough knowledge of the subject area. There were even cases of disclosure of confidential information: one teaching assistant shared answers to the tests with students.
Beresneva E., Gordenko M. Applying the methods of system analysis to teaching assistants' evaluation. Trudy ISP RAN/Proc. ISP RAS, vol. 30, issue 3, 2018. pp. 251-270
Thus, there is a strong need to define a set of selection factors in a well-founded manner. Recently, the head of the Computer Science faculty ordered each teacher (or group of teachers) in every discipline to choose the best teaching assistant, who will receive an incentive award. In addition, next year the number of students will be reduced, so the number of assistants must decrease as well. On the «Discrete mathematics» course, the same teaching assistants tend to return from year to year. This situation suggested that additional criteria should be used when assessing teaching assistants, so that the group of teachers can select both the best assistant and the group of the most successful assistants. Thus, two tasks are faced: to choose the best assistant on the «Discrete mathematics» course and to select the group of the most successful assistants, with whom it is possible to continue working on this course. The purpose of this work is to develop a search method that selects the best assistant and the group of the most successful ones according to the criteria set by the group of teachers. The rest of the paper is organized as follows. We discuss the problem specification in Section 2 and introduce the premises of the model that we use to illustrate our main results in Section 9. Sections 3, 4 and 6 present the different methods used to solve the problem. In Sections 5 and 7 the results of AHP and Fuzzy AHP are discussed. Section 8 presents a sensitivity analysis.

The Difference between Previous Works and Our Approach
The literature review shows that there are many studies reporting the success of teaching assistant programs in general. The most recent one is [3].
However, no article addresses either selection criteria for teaching assistants or a methodology for searching for the best ones. The study closest to our problem proposes a framework for evaluating students' performance [4]. That work is based on the hard approach only. It uses a variation of the most widely used approach to multi-criteria decision making, the Analytic Hierarchy Process, which combines mathematics and expert judgment. Since the Analytic Hierarchy Process suffers from imprecision and subjectivity, that paper proposes to use Fuzzy AHP instead of the traditional method. However, there is also an opinion that applying Fuzzy AHP is useless. In [3] it is said that "the numerical representation of judgments in the AHP is already fuzzy" and "making fuzzy judgments more fuzzy does not lead to a better more valid outcome and it often leads to a worse one." Our article shows that Fuzzy AHP with the type-2 modification can still be used in a decision-making process. Moreover, our study combines both hard and soft approaches, because the problem involves not only the main criteria but also many additional ones, and these auxiliary factors cannot be described using formal algorithms alone.

Problem Definitions
The problem of finding the best teaching assistant and the group of the best teaching assistants is closely related to finding the criteria by which teaching assistants should be selected. To analyze the domain and determine its boundaries, a Rich Picture can be applied. A Rich Picture is a collection of sketches, pictures, photos, symbols, and captions that represent a particular real-world situation or question from the point of view of the person or group of people who create it. Its components are people (stakeholders), systems, processes, interfaces, data streams, information sources, infrastructure objects, supporting and impeding factors, emotions, points of view and attitudes, etc. A Rich Picture can reflect the interactions and connections of the system components (or the surrounding world), their influence, and cause and effect. It can also represent such subjective elements as attitude (perception), point of view, and prejudice [1]. It is used to explore and aggregate the physical, conceptual and emotional aspects of the actual situation (system/problem/need). The Rich Picture of the «Teaching assistants» interactions in the «Discrete mathematics» discipline is provided in Fig. 1. To analyze the subject area and project boundaries, the CATWOE technique is a good addition to Rich Pictures. CATWOE was defined by Peter Checkland as a part of his Soft Systems Methodology (SSM). It is a simple checklist for thinking. CATWOE is an acronym in which each letter stands for a specific word: Clients, Actors, Transformation, World view, Owner, Environmental constraints [2].

Role: Description
Clients: Teachers who want to assess their teaching assistants; students who need the assistants' help.
Actors: Groups of teachers who are interested in evaluating the skills of teaching assistants and choosing the group of the best teaching assistants; the head of faculty, who wants to encourage the best teaching assistant.
Transformation: A teaching assistant receives points for certain evaluation criteria.
World view: It is necessary to define a group of the best teaching assistants and the single best teaching assistant. Defining the group of the best teaching assistants is necessary in order to reduce the risks associated with incompetent and disinterested assistants in next year's group.
Owner: Teachers and the head of faculty.
Environmental constraints: National educational and assessment standards.
After analyzing the processes and interactions associated with the members of the system, a clear understanding of the subject area emerges. There are three teachers on the "Discrete mathematics" course: one lecturer (the leading teacher) and two seminar instructors. They form the decision group for choosing the best assistants. This group can be expected to produce fair and reliable evaluation results because its members work closely with the teaching assistants during the course.
In order to evaluate the assistants, it is decided to come up with evaluation criteria. After the first brainstorm, the list of criteria resembles a chaotic list of records. The next meeting of the teachers shows that some of the criteria identified at the first stage are duplicated or unnecessary. After long discussions and joint brainstorming, three main groups of criteria are identified: professional skills, communication skills, and personal qualities.
The professional skills include the following sub-criteria:
- active involvement in the process of forming the program of the discipline;
- initiative to compile new types of tasks for tests;
- knowledge of the subject domain;
- quality of homework checking;
- speed of homework checking;
- experience of active use of the LMS.
The communication skills include the following sub-criteria:
- pedagogical experience, the ability to present information correctly;
- openness to student issues (e.g. quick response to questions, competent answers);
- participation in counseling sessions before the tests and examinations;
- active communication with teachers, participation in weekly meetings;
- the ability to listen carefully.
The personal qualities include the following sub-criteria:
- ethical compliance;
- punctuality;
- self-motivation, the desire for development;
- responsibility for work;
- teamwork skills;
- subordination;
- striving to achieve common results;
- resistance to conflict situations;
- the ability to generate new and innovative ideas;
- the ability to compromise;
- benevolence.
From the first group, the following criterion is deleted:
- active involvement in the process of forming the program of the discipline.
The following criteria are combined, as they both characterize the checking of homework and are closely interrelated:
- quality of homework checking;
- speed of homework checking.
From the second group, the following criteria are deleted:
- pedagogical experience, the ability to present information correctly. This ability can be learned; one of the goals of the "Teaching Assistant" program is the development of teaching skills;
- the ability to listen. In our opinion, this parameter is almost impossible to estimate.
From the third group, the following criteria are combined, because they are closely interrelated and cannot be separated:
- self-motivation, the desire for development;
- responsibility for work.
The following criteria are deleted:
- teamwork skills. This is covered by the responsibility for work criterion;
- subordination. By default, the main person on the course is the teacher, and this must be understood from the start;
- striving to achieve common results. This is covered by the responsibility for work criterion;
- resistance to conflict situations. It is the responsibility of the teacher to resolve conflict situations and prevent them from arising;
- the ability to generate new and innovative ideas. This is not a primary task of the teaching assistant; an assistant can do excellent work without coming up with new ideas, and that is acceptable;
- the ability to compromise. The teacher has the last word;
- benevolence. This is covered by the ethical compliance and punctuality criteria.
The final selected criteria and sub-criteria are shown in Fig. 2. All the criteria and sub-criteria have their own identification numbers.

Exploring the alternatives
There are ten teaching assistants on the course: A, B, C, D, E, F, G, H, I, J. We can reduce the number of teaching assistants to be evaluated by first assessing their involvement in the educational process. We have 3 groups of criteria, consisting of 9 sub-criteria. To assess the involvement of assistants in the educational process, we do not use the values of the last three sub-criteria (3.1-3.3). These sub-criteria belong to the group of personal qualities and cannot be regarded as involvement in the educational process. The involvement of each teaching assistant in the educational process is then evaluated for each criterion, based on expert judgment. The results are presented in Table 2. Let us find out which assistants are least involved in the work process, according to the experts. Calculations for thresholds equal to 3, 4 and 5 are shown in Tables 3, 4 and 5.
Tables 3 and 4 make it possible to identify the teaching assistants who are least involved in the educational process. Table 5, with the threshold equal to 5, shows that none of H, I, and J is indispensable.
Thus, it is decided not to consider the last three teaching assistants (H, I, J) any further. However, their low involvement in the educational process has its own explanations:
- H was ill for two months;
- I was out of contact;
- J decided to switch to another faculty, and preparation for the exams took all his spare time.
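The screening step above can be sketched in code. The text does not spell out how the threshold tables are computed, so this sketch assumes one plausible reading: for each assistant, count the involvement sub-criteria on which the expert score reaches the threshold, and flag assistants who never reach it. The scores, the 1-5 scale, and the function names are hypothetical and are not taken from Tables 2-5.

```python
# Hypothetical expert scores (assumed 1-5 scale) on the six involvement
# sub-criteria (groups 1 and 2). Illustrative numbers only, NOT Table 2.
scores = {
    "A": [5, 4, 4, 3, 4, 5],
    "B": [4, 4, 5, 4, 4, 4],
    "H": [2, 3, 2, 1, 2, 2],
    "J": [1, 2, 3, 1, 1, 2],
}


def count_at_or_above(scores, threshold):
    """For each assistant, count sub-criteria scored at or above the threshold."""
    return {
        name: sum(1 for s in values if s >= threshold)
        for name, values in scores.items()
    }


def least_involved(scores, threshold):
    """Return assistants who do not reach the threshold on any sub-criterion."""
    counts = count_at_or_above(scores, threshold)
    return sorted(name for name, c in counts.items() if c == 0)


for t in (3, 4, 5):
    print(t, count_at_or_above(scores, t), least_involved(scores, t))
```

Under this reading, an assistant who is flagged at the highest threshold is never the top performer on any involvement sub-criterion, which is one way to interpret "not indispensable".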
Thus, seven candidates remain. It is difficult to find the best one because each of them is successful in one or more criteria. The stakeholders are inclined to choose A as the winner because this assistant took part in all teacher meetings and regularly suggested new types of tasks for tests (approximately once every two weeks). Assistant A communicated with the teachers a lot and was constantly in their sight, which is why they prefer him. However, this decision may be unfair, which is why it is decided to apply a multicriteria decision making (MCDM) process.

Analytic Hierarchy Process
The Analytic Hierarchy Process (AHP), one of the most widely used MCDM approaches [3], is a structured multicriteria technique for organizing and analyzing complex decisions that involve many criteria. In this paper we use the classical AHP proposed by its author [4]. At the first step of AHP, a model of the decision is developed. Experts break the decision down into a hierarchy of goals, criteria, sub-criteria and alternatives. After that, the decision makers derive priorities (weights) for the criteria with respect to the desired goal. This is done in the form of pairwise comparisons using individual questionnaires. Since the evaluation criteria are subjective and qualitative in nature, it is very difficult for a decision maker to express preferences using exact numerical values. That is why a special numerical scale [4], which assigns numbers to linguistic terms, is used (see Table 6). The comparison results of all experts are presented in the form of matrices (see Table 7). Before calculating the weights, the consistency of each comparison matrix is checked.
As a rule, the comparisons are considered acceptable only if the consistency ratio is less than 0.1; otherwise the pairwise comparisons should be revised. In this decision-making process, all ratios are less than 0.092, which shows that the answers are consistent. On the basis of Table 7, the final matrix is created by taking the mean of the estimates of all experts (see Table 8). The mean is used because it was firmly decided that the votes of all experts should carry equal weight. The matrix from Table 8 is used to calculate the priority weights of the criteria. In the same way, a pairwise comparison of all the sub-criteria with respect to each criterion included in the decision-making model is made. The results are shown in Table 9 (for example, sub-criterion 3.3, self-motivation, receives 12.011%). The next step consists of deriving the relative priorities (preferences) of the alternatives with respect to each criterion. The overall priority weights of the assistants are calculated by summing their weighted local priorities. The final figures are shown in Table 10. A bar chart is built on the basis of the overall preferences of the alternatives (see Fig. 3).
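The AHP computations described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it approximates the principal eigenvector by power iteration, uses Saaty's standard random index values for the consistency ratio, and combines local priorities into overall ones by a weighted sum. The function names and the example matrix are assumptions; the matrix is not taken from Tables 7 or 8.

```python
# Saaty's random consistency index (RI) for matrices of order 1..10.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def priority_weights(matrix, iterations=100):
    """Approximate the principal eigenvector (normalized to sum to 1)."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    return w


def consistency_ratio(matrix, weights):
    """CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


def overall_priorities(sub_weights, local_priorities):
    """Weighted sum of each alternative's local priorities."""
    return {
        name: sum(w * p for w, p in zip(sub_weights, values))
        for name, values in local_priorities.items()
    }


# Hypothetical 3x3 pairwise comparison matrix for the three criteria groups.
example = [[1, 3, 5],
           [1 / 3, 1, 3],
           [1 / 5, 1 / 3, 1]]
weights = priority_weights(example)
print(weights, consistency_ratio(example, weights))
```

A ratio below 0.1 for the example matrix would mean the judgments pass the acceptance rule quoted above; a perfectly consistent matrix gives a ratio of zero.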

A Discussion on AHP Results
The AHP analysis shows that the hasty decision to choose A as the best assistant is unfair. The results reveal that the experts did not take into account other important criteria, which together outweigh those considered at first. Another problem discovered for A is that some of his/her estimates are the worst among all assistants (for instance, on criteria 3.1 and 3.2). This also decreases his/her chances of being the leader.
The most interesting point of the results is that the highest figures belong to two assistants, B and D. Let us describe each of them. Assistant B cannot be called a brilliant employee. Nevertheless, he/she showed good, stable work without bad results in any of the activities during the course.
Despite not being the best in any of the criteria, B was always close to the leader. Like B, assistant D showed quite strong results in the technical and communication estimates. In addition, D was at the top in the personal qualities criteria. He/she gives the impression of a highly self-motivated person and was never late for any event. The result of D exceeds that of B by an inconspicuous gap of only 0.3%. Since the experts agreed that the leader must have an advantage of no less than 2%, they do not consider such a difference decisive. In addition, the evaluation of the five best assistants is problematic. Four employees can be determined more or less clearly (A, B, C, and D). However, the difference between E and the closest competitor, G, is less than 1%, which is also insignificant.
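The experts' margin rule can be expressed as a small check. This sketch assumes the 2% advantage is an absolute difference between overall priority weights (percentage points); the priority values and the function name are hypothetical and are not the figures from Table 10.

```python
def clear_leader(priorities, min_margin=0.02):
    """Return the top alternative if it leads the runner-up by at least
    min_margin (absolute difference in overall priority), else None."""
    ranked = sorted(priorities.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_w), (_, second_w) = ranked[0], ranked[1]
    return best if best_w - second_w >= min_margin else None


# Hypothetical overall priorities: D leads B by only 0.3 percentage points.
close_race = {"A": 0.150, "B": 0.180, "D": 0.183}
print(clear_leader(close_race))  # no clear leader under the 2% rule
```

With a gap this small the function returns no leader, which mirrors the situation that motivates trying a second method.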

Fuzzy Type-2 AHP
Since the experts want to be more confident in the fairness of their choice, we decide to apply another MCDM approach to achieve our goal. It is called Fuzzy AHP. In classical AHP, crisp numbers are used for pairwise comparison evaluations. In Fuzzy AHP, however, the linguistic variables are represented as fuzzy numbers instead of crisp ones. In this case, fuzzy logic provides the mathematical means to capture the uncertainties associated with the human cognitive process. Many researchers [5], [6] who have studied Fuzzy AHP have provided evidence that it gives a more adequate description of decision-making processes than the traditional AHP methods. According to [7], the membership functions of type-1 fuzzy sets have no uncertainty associated with them. Type-2 fuzzy sets generalize type-1 fuzzy sets and systems so that more uncertainty in defining membership functions can be handled. That is why type-2 fuzzy logic is used.
A type-2 fuzzy set $\tilde{A}$ is called an interval type-2 fuzzy set if all $\mu_{\tilde{A}}(x, u) = 1$ [8]. Interval type-2 fuzzy sets are the most commonly used type-2 fuzzy sets because of their simplicity and reduced computational effort compared with general type-2 fuzzy sets. For this reason, we use interval type-2 fuzzy sets. A trapezoidal interval type-2 fuzzy set is described in [7]. The pairwise comparison matrices obtained from the experts for AHP are applied directly in Fuzzy AHP. We introduce interval trapezoidal type-2 fuzzy scales of the linguistic variables (see Table 11). They are a modified version of the scales proposed in [8] and include intermediate values between two adjacent judgments, as in AHP. The priority weights of the criteria (Table 12) and sub-criteria (Table 13) are given.
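To make the idea of a trapezoidal interval type-2 fuzzy number more concrete, the sketch below represents it as a pair of trapezoidal membership functions, an upper and a lower one, so that the membership of each point is an interval rather than a single value. The parameter values, the lower height of 0.8, and the function names are hypothetical; they are not the scale from Table 11.

```python
def trapezoid(x, a, b, c, d, height=1.0):
    """Membership of x in a trapezoidal fuzzy number (a, b, c, d), a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return height * (x - a) / (b - a)
    if x <= c:
        return height
    return height * (d - x) / (d - c)


# Hypothetical linguistic term: the upper membership function has height 1,
# the lower one is narrower and has height 0.8 (assumed values).
UPPER = (1.0, 2.0, 4.0, 5.0, 1.0)
LOWER = (1.2, 2.2, 3.8, 4.8, 0.8)


def membership_interval(x, lower=LOWER, upper=UPPER):
    """Return (lower membership, upper membership) of x."""
    return trapezoid(x, *lower), trapezoid(x, *upper)


for x in (1.6, 3.0, 4.9):
    print(x, membership_interval(x))
```

The gap between the two values at each point is the extra uncertainty that a type-1 fuzzy number, with a single membership function, cannot express.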

Discussion on Fuzzy-Type-2 AHP Results
Now we see that assistant D has a higher priority weight than B, and the difference between them (2%) is acceptable to the experts. In addition, it can be seen that E should certainly be in the top five group (the difference is also about 2%). Thus, Fuzzy AHP does not change the ranks of the alternatives but makes them clearer. This means that more reliable results are obtained, since interval type-2 fuzzy sets can better represent uncertainties. It is important to note that, contrary to common belief, the system does not determine the decision we should make. Rather, the results should be interpreted as a blueprint of preferences and alternatives based on the level of importance obtained for the different criteria, taking our comparative judgments into consideration. In other words, the AHP methodology allows us to determine which alternative is the most consistent with our criteria and with the level of importance that we give them.
Taking this point into account, Sensitivity Analysis is used. It performs a "what-if" analysis to see how the final results would change if the weights of the criteria were different [9]. Let us start with the goal of finding the best teaching assistant. The first criterion has the highest weight in our results (50%). If we decrease its weight and proportionally increase the other weights, then D will still be the leader. In this case D will have an even more clear-cut victory. Conversely, if we increase the weight of this criterion to 60% or more, then B will become the new leader. However, the stakeholders agree that no single criterion should account for more than half of the total, and they have stressed that the first criterion (professional skills) should remain the most important one. This means that the weight of the first criterion should lie approximately in the range [33%; 50%]. Let us now tune the proportions of the second and the third criteria.
Calculations show that D can cease to be the winner if and only if the third criterion weighs less than the second. This point was therefore brought to expert discussion, and the experts unanimously decided that personal qualities (the third criterion) should be valued higher than communication skills. Another important point is the change of proportions of the sub-criteria within their criteria. There are no strong disputes about the sub-criteria weights: the experts' opinions differ by no more than 10%. Changing the sub-criteria preferences within that range does not affect the leader. This means that small changes in the current proportions of the criteria weights cannot produce a leader other than D. At the same time, the choice of the top five assistants is more complex. The analysis shows that four assistants are determined clearly. They are A, B, C, and D. The fifth assistant can be either E or G.
Calculations reveal that the position of assistant G is directly connected with the second criterion: if its weight is equal to or greater than 15%, then G will be in the top five group instead of E. However, the second criterion currently has a weight of only about 10%. Finally, after the Sensitivity Analysis is done, the following recommendations are given to the experts:
- to choose assistant D as the winner;
- to extend the contracts with A, B, C and D;
- to extend the contract with E if the experts think that personal qualities should be at least twice as important as communication skills (that is, communication skills should have a weight of less than 15%);
- to extend the contract with G otherwise.
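The "what-if" procedure described above can be sketched as follows: one criterion weight is changed, the remaining weights are rescaled proportionally so that they still sum to one, and the ranking is recomputed. The criteria weights of roughly 50%, 10% and 40% follow the approximate figures mentioned in the text; the per-criterion scores of B and D and the function names are hypothetical, chosen only so that the example reproduces the qualitative behavior reported (D leads at 50%, B leads once the first weight is well above 60%).

```python
def rescale(weights, index, new_weight):
    """Set weights[index] to new_weight and scale the other weights
    proportionally so that the total remains 1."""
    factor = (1.0 - new_weight) / (1.0 - weights[index])
    return [new_weight if i == index else w * factor
            for i, w in enumerate(weights)]


def ranking(criteria_weights, scores):
    """Order alternatives by their weighted total, best first."""
    totals = {
        name: sum(w * s for w, s in zip(criteria_weights, values))
        for name, values in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Approximate criteria weights: professional, communication, personal.
base_weights = [0.5, 0.1, 0.4]
# Hypothetical per-criterion priorities of two alternatives.
scores = {"B": [0.40, 0.30, 0.20], "D": [0.30, 0.30, 0.40]}

for w1 in (0.33, 0.5, 0.6, 0.7):
    print(w1, ranking(rescale(base_weights, 0, w1), scores))
```

Sweeping one weight at a time in this way shows the range over which the leader stays the same, which is what the experts need in order to judge how robust a recommendation is.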

Final Result and Conclusion
Taking into consideration the recommendations mentioned above, the group of teachers decided to follow the first two of them. They selected D as the best teaching assistant on the «Discrete mathematics» course. They also extended the contracts with assistants A, B, C, and D. The main step now is to choose the fifth assistant. Before making a choice, the experts decide to look back through all the methods that were applied earlier. The lecturer of the course noted that, with assistants A, B, C, and D already confirmed, nobody would be responsible for communication with students (answering questions, holding consultations), because assistant F did this before. However, the choice is now between E and G. In this respect G demonstrates a clear superiority over the others, as he/she is one of the best at this kind of work. Finally, G is chosen. At the very beginning, the teachers wanted to choose assistant A as the best teaching assistant. However, the soft methods of analysis helped us to choose another assistant. Also, neither AHP nor Fuzzy AHP chose G as the fifth best assistant in the group. Only sound logic helped us to do this. The application of the methods of system analysis can help to make a decision, but it does not make the choice for us. We should look carefully at the results of system analysis methods, but make a balanced and considered decision ourselves.