
INCORPORATING PROGRAMMING COMPETITION PARADIGM INTO THE INSTRUCTIONAL DESIGN FOR TEACHING COMPUTER PROGRAMMING

Theme: Integration in Education

Abdullah Sani Bin Abd. Rahman


Computer and Information Sciences Department
Universiti Teknologi PETRONAS
Seri Iskandar, Perak.
sanirahman@petronas.com.my

Suraya Binti Masrom


Fakulti Teknologi Maklumat & Sains Kuantitatif
Universiti Teknologi MARA
Seri Iskandar, Perak.
suray078@perak.uitm.edu.my

Abstract

Computer programming is considered a fundamental subject in most engineering and computer-related programmes in many universities. The teaching responsibility is usually given to a single department that services the entire university. Staffing limits almost certainly result in large classes, which in turn impose a heavy workload on the teaching staff and make close monitoring of students' performance next to impossible. At the same time, plagiarism in students' assignments has become so rampant that it hinders lecturers' efforts to correctly gauge students' progress. In this paper we propose to incorporate a Computer-Assisted Assessment system, originally designed for a programming competition, into the instructional design for teaching computer programming. The control mechanisms implemented in the system help curb plagiarism, and the simplicity of the system architecture allows smooth and easy deployment. We have tested the system in both a competition and a programming laboratory environment. In the last section of the paper, we present our findings regarding the feasibility, scalability and accuracy of the system, as well as its limitations.

Keywords:

Instructional Design, Computer Programming Competition, Computer-Assisted Assessment

1.0 Introduction
The increasing number of students in programming classes (Hidekatsu, Kiyoshi, Hiko, & Katsunori, 2006; Hussein, 2008) warrants the use of computer-supported systems to relieve lecturers of routine academic tasks. Student assessment has become an issue because of the large number of students involved. For a fundamental course such as computer programming, it has become necessary to oversee students' weekly progress to make sure that every student keeps pace with the rest of the class. Given the difficulties involved, Computer-Assisted Assessment (CAA), as explained in (Janet et al., 2003), can help reduce the marking chores and results management workload. As a consequence, the number of assessments can be increased gradually and student performance can be monitored very closely.

We believe the incorporation of the computer programming competition paradigm into the teaching environment will be beneficial to both lecturers and students. Three aspects of a programming competition are particularly appealing to us. The first is the psychological setup: contestants work under pressure and struggle to outperform one another. The second is the set of control mechanisms: contestants are isolated from one another and have no one to help them, except for their team-mates when they are not competing at the individual level. Lastly, the assessment criteria differ from those of regular tests or assignments: contestants are evaluated on their ability to deliver fully functional programs that adhere to the specifications in the shortest time possible.

We are also interested in investigating the feasibility and scalability of the system if it were to be widely implemented. The performance and accuracy of the system output will also be evaluated. While some measurements can be obtained directly from the running system, other more subjective measures will be evaluated through user perceptions.

2.0 Related Work


a. The development of CAA

According to (Christopher, David, & James, 2005), assessment systems can be divided into three generations: Early Assessment Systems, Tool-Oriented Systems and Web-Oriented Systems. We classify our CAA system as a Tool-Oriented System, i.e. one built from pre-existing tool sets and utilities supplied with the operating system and programming environment. An example of a Tool-Oriented System can be seen in the work of (David & Michelle, 1997), which introduces a scheme that analyses submissions against several criteria. The system, named ASSYST, is capable of analysing the correctness, efficiency, complexity and style of a program. The BOSS system (Mike, Nathan, & Russell, 2005), which is similar to ASSYST, ran on the Unix operating system and was used for C programming assessment. The latest version of BOSS provides a Java GUI application for tutor grading and assignment management. Michael et al. (Michael, Steven, Ann, & Vallipuram, 2004) detail a system called GAME for grading a variety of programming languages by comparing program outputs with a marking scheme written in XML. The system can examine program structure and the correctness of the program's output. It has also been tested on a number of student programming exercises and assignments, and an analysis comparing human marking with the GAME system provides encouraging results.

b. Programming Competition Paradigm

According to (James, 2008), the ACM student chapter at the College of Charleston hosts programming competitions for high school students every year. Current programming competition assessment methods stress completing and running programs rather than the structure or other basic elements of a program; according to him, these other evaluation aspects merit consideration as well. The concept of black-box testing has been used extensively in the judging process of many programming competitions. Black-box testing takes an external perspective of the test object, while white-box testing evaluates the object's internal structure. James (2008) introduces a new programming contest paradigm that incorporates both technical and artistic criteria in the assessment process.

3.0 Approach
3.1 System architecture

a. Laboratory setup

The architecture of the physical laboratory setup is depicted in figure 1. The setup consists of a number of computers linked together with a network switch. A server is also needed to host a File Transfer Protocol (FTP) application server, and each user or team is allocated an account on it. Students or contestants submit their programs to the server upon completing them. We have also made the necessary configuration to isolate the participants from each other even though they are connected to the same network. Access to outside networks such as the Internet has also been disabled.

Figure 1: LAN Setup
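To make the submission-collection step concrete, the following is a minimal sketch of how programs could be pulled from the per-user FTP accounts described above. It is our own illustration, not the authors' implementation; the server address, account names and folder layout are assumptions.

# Sketch: collect submissions from per-user FTP accounts.
# The host address, credentials and local layout are illustrative assumptions.
import os
from ftplib import FTP

SERVER = "192.168.0.1"   # hypothetical address of the lab FTP server

def fetch_submissions(accounts, dest="submissions"):
    """Download every file in each user's home directory on the FTP server."""
    for user, password in accounts:
        local_dir = os.path.join(dest, user)
        os.makedirs(local_dir, exist_ok=True)
        with FTP(SERVER) as ftp:
            ftp.login(user=user, passwd=password)
            for name in ftp.nlst():      # files sitting in the user's folder
                with open(os.path.join(local_dir, name), "wb") as fh:
                    ftp.retrbinary("RETR " + name, fh.write)

if __name__ == "__main__":
    fetch_submissions([("team01", "secret01"), ("team02", "secret02")])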

b. The automated process

Several tools are used to support the automated processing, such as the data-processing interface, program testing and, optionally, program benchmarking. The overall automated process is depicted in figure 2.

Figure 2: System flow of the automated process (start → retrieval of source code from user accounts → extraction of file timestamps → [optional] compilation → functional test with acceptance check and, optionally, performance test with benchmarking → result compilation → end)
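The flow in figure 2 could be driven by a script along the following lines. This is a sketch under our own assumptions about directory layout and compiler invocation (GCC is the compiler named in section 4.2); it is not the authors' code.

# Sketch of the automated flow in figure 2 (our own illustration):
# read file timestamps, optionally compile, then hand over to the tests.
import glob
import os
import subprocess

def process_submission(src_path):
    """Extract the submission timestamp and compile one program with GCC."""
    submitted_at = os.path.getmtime(src_path)   # file timestamp, later used for TTC
    exe_path = src_path.rsplit(".", 1)[0]
    compiles = subprocess.run(["gcc", src_path, "-o", exe_path],
                              capture_output=True).returncode == 0
    # The functional test (section 3.1c) and, optionally, the performance test
    # (section 3.1d) would be invoked on exe_path at this point.
    return {"source": src_path, "submitted_at": submitted_at, "compiles": compiles}

if __name__ == "__main__":
    # Result compilation: one record per submitted program.
    for path in sorted(glob.glob("submissions/*/*.c")):
        print(process_submission(path))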


c. Functional test

The main purpose of functional testing is to verify the program output submitted by the student. The verification can be done by manually viewing the output files or by performing a similarity test, as illustrated in figure 3. The tested program is accepted if it passes the similarity test. We determine acceptance by comparing the similarity value against a set threshold; in our experiments the threshold was set between 50% and 80%. The higher the threshold value, the more rigid the similarity test becomes.

Figure 3: Block diagram for the similarity test (the user's program output and the answer scheme are fed to the similarity test, which uses the threshold value to decide acceptance or rejection)
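Section 4.2 names sim_text as the similarity function used in our setup; as a rough stand-in, the sketch below uses Python's difflib to score the similarity between a program's output and the answer scheme and applies a threshold in the 50%-80% range mentioned above. The file paths and function name are illustrative assumptions, not the authors' implementation.

# Sketch of the similarity test in figure 3, with difflib standing in for sim_text.
import difflib
import subprocess

def similarity_test(exe_path, answer_scheme_path, threshold=0.5):
    """Run the student's program and accept it if its output is similar enough
    to the answer scheme. The paper sets the threshold between 50% and 80%."""
    output = subprocess.run([exe_path], capture_output=True,
                            text=True, timeout=10).stdout
    with open(answer_scheme_path) as fh:
        expected = fh.read()
    score = difflib.SequenceMatcher(None, output, expected).ratio()
    return score >= threshold   # True = accepted, False = rejected

# Hypothetical usage: a stricter 80% threshold accepts only near-identical output.
# accepted = similarity_test("submissions/team01/q1", "answers/q1.txt", threshold=0.8)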

d. Performance test

The performance measure is collected through the testing process shown in figure 4. It is, however, not always necessary to run performance tests unless the problem given to the students requires complex functions, in which case the judges will be interested in evaluating the efficiency of the implemented functions. It is impractical to compute the Big-O complexity (Arefin, 2006) of each program given the limited assessment time; the performance measure is therefore the closest yardstick for gauging the elegance and programming style of the contestants (or students).

Figure 4: Block diagram for the performance test (the source code is compiled with the profiling option/switch, the instrumented executable is loaded and executed to produce profiling data, and the profiler turns this data into a performance report)
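Since GCC and gprof are the compiler and profiler listed in section 4.2, the performance test in figure 4 could be scripted roughly as follows. The function and file names are our own assumptions; the paper does not state the exact switches the authors used.

# Sketch of the performance test in figure 4: build with profiling instrumentation
# (gcc -pg), run the program to produce gmon.out, then ask gprof for a report.
import subprocess

def profile(src_path, exe_path="a.out"):
    """Compile with the profiling switch, execute, and return the gprof report."""
    subprocess.run(["gcc", "-pg", src_path, "-o", exe_path], check=True)
    subprocess.run(["./" + exe_path], check=True, timeout=10)  # writes gmon.out
    report = subprocess.run(["gprof", exe_path, "gmon.out"],
                            capture_output=True, text=True, check=True).stdout
    return report

# Hypothetical usage: print the performance report for one submission.
# print(profile("submissions/team01/q1.c", "q1"))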


e. Control mechanism

In line with the practice in programming competitions, we allow only a one-time submission for any programming problem. After a program is submitted, the system locks the file, preventing any further modification, so students have to test and evaluate their program thoroughly before submitting the answer. Whenever a program is submitted, the system records its submission time, which allows the calculation of the time taken to complete (TTC) the program. The assumption we make is that every student works on one problem at a time; the submission of one solved problem therefore marks the beginning of work on the next program to be submitted. The calculation of TTC remains relevant even if the students are given a number of questions at a time, since their decision on which problem to solve first does not affect the calculation. Security mechanisms on the FTP server limit each student's access to their dedicated folder, which prevents any attempt to sabotage other competing teams or to plagiarize their work. The networking equipment is also configured to allow access only from clients to the server; client-to-client and client-to-Internet communication are blocked in order to maintain a totally isolated environment.
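Under the stated assumption that each submission closes one problem and opens the next, TTC can be derived from the ordered submission times recorded by the server. The sketch below is our own illustration; the session start time and data layout are assumed.

# Sketch: derive time-to-complete (TTC) from ordered submission timestamps.
# Each submission ends one problem and starts the next, so TTC is the gap
# between consecutive submissions (the first is measured from the session start).
from datetime import datetime

def times_to_complete(session_start, submissions):
    """submissions: list of (problem_id, submission_time) recorded by the server."""
    ordered = sorted(submissions, key=lambda item: item[1])
    ttc = {}
    previous = session_start
    for problem_id, submitted_at in ordered:
        ttc[problem_id] = submitted_at - previous
        previous = submitted_at
    return ttc

# Hypothetical timestamps: the order in which problems are attempted does not
# matter, only the gaps between successive submissions.
start = datetime(2009, 3, 2, 9, 0)
print(times_to_complete(start, [("q2", datetime(2009, 3, 2, 9, 40)),
                                ("q1", datetime(2009, 3, 2, 9, 15))]))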

3.2 Student assessment method

We have divided the assessment process into three steps:

a. Question design and selection

The questions are designed in such a way that weekly progress can be evaluated easily. Each question is tailored to specific learning objectives/topics; to distinguish between good and average students, however, there are also questions that regroup several topics at once. These questions are stored in a dedicated question bank and are not given to students except during the administration of the tests. The reason is simply to make sure that future students will not already have been exposed to the same questions, which would render further experiments invalid or biased.

b. Selection of evaluation criteria

Three criteria have been selected for the evaluation of the student programs: functionality, time to complete and program performance. Functionality is measured simply by conducting black-box testing of the program: if the program conforms to the set requirements it is accepted, otherwise it is rejected (Hussein, 2008). Time to complete is the duration taken by the student to complete a functional/acceptable program; it is recorded once the student submits the program to the FTP server.

c. Student performance

Since the system is adapted to a teaching environment, we have found it necessary to introduce two derived measures: competency and proficiency.

We define competency as a measure that links learning objectives/topics to the validity of the programs that students write. A program is considered valid only if it successfully compiles, runs and produces output whose similarity value is greater than the set threshold. When a program passes the validity check, we record the mark in the respective student record. Proficiency is defined in this context as the ability to produce the correct program within a reasonable time; in certain cases we also link proficiency with the ability to produce elegant solutions within reasonable time. Reasonable time is in turn defined by the average time to complete (for any specific program) of the whole class or student batch. While there may be more sophisticated measures of elegance in programming, in this work we limit ourselves to the performance report of each program.
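A minimal sketch of how the two derived measures could be computed, based on our reading of tables 1 and 2 below: competency combines the run flag with the weighted similarity score, while proficiency adds marks for time to complete (relative to the class average) and for the performance score. The exact weights and scaling are assumptions for illustration only, not the authors' marking scheme.

# Sketch of the derived measures (weights and scaling are illustrative assumptions).

def competency_mark(runs, similarity, weight=4):
    """Competency per topic, following the layout of table 1: 1 mark if the
    program runs, plus the output similarity (0.0-1.0) scaled by the weight."""
    if not runs:
        return 0
    return 1 + round(similarity * weight)

def proficiency_mark(runs, ttc_minutes, class_average_minutes,
                     performance_score, ttc_weight=4, perf_weight=5):
    """Proficiency per topic, following the layout of table 2: 1 mark if the
    program runs, up to 4 for finishing within the class-average time to
    complete, and up to 5 for the performance score (0.0-1.0)."""
    if not runs:
        return 0
    ttc_mark = ttc_weight * min(1.0, class_average_minutes / ttc_minutes)
    return 1 + round(ttc_mark) + round(perf_weight * performance_score)

# Hypothetical example: a working program with 80% output similarity, finished
# faster than the class average, with a middling performance score.
print(competency_mark(True, 0.8))                              # -> 4
print(proficiency_mark(True, 25, 30, performance_score=0.6))   # -> 8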
Table 1: Student's competency by topic

Student Name: ______________________

Lab test | Topic                                     | Run (0/1) | Similarity (4*) | Total Marks (5)
1        | Basic data types and arithmetic operators | 1         | 4               | 4
2        | Sequential control structure              | 1         | 2               | 2
3        | Selection control structure               | 0         | 0               | 0
4        | Repetition For                            | 1         | 1               | 2
5        | Repetition While                          | 0         | 0               | 0
6        | Functions                                 | 0         | 0               | 0

*weighted score

Table 2: Student's proficiency by topic

Student Name: ______________________

Lab test | Topic                                     | Run (0/1) | Time to complete (4*) | Performance (5*) | Total Marks (10)
1        | Basic data types and arithmetic operators | 1         |                       |                  |
2        | Sequential control structure              | 1         |                       |                  |
3        | Selection control structure               | 0         |                       |                  |
4        | Repetition For                            | 1         |                       |                  |
5        | Repetition While                          | 0         |                       |                  |
6        | Functions                                 | 1         |                       |                  |

*weighted score

3.3 Users’ evaluation of the system

We implemented the system and had our students use it for a period of four months. At the end of the semester, we conducted a study on the students' perception of the incorporation of the new tool in their learning environment, using some questions similar to those in (Hussein, 2008). The purpose of this survey is to gather students' views on several aspects, such as the fun element, self-awareness, discipline, motivation, the accuracy of the system in marking and the overall impact on the students' learning experience.

The subjects of our study were a class of 30 students taking a computer programming course. All of them were Part 2 Quantitative Science (CS113) students, and the subject taught was Introduction to Computer and Problem Solving (using C++). The total coursework mark allocated for the subject is 40%, of which 20% comes from the lab tests conducted using the new system.
4.0 Results and Analysis

4.1 Students’ perception

The thirty students were randomly divided into ten programming teams. They were given ten questions of medium difficulty and asked to answer as many as they could within a period of two hours. At the end of the lab session they were given the report generated by the system and asked to review their answers and compare their results with those of the other teams. Finally, each of them was given the questionnaire shown in table 3 to complete.

Table 3: Results from the student survey

Question                                                                            | N | 1 | 2 | 3  | 4
1. Do you find the system enjoyable to use in a learning environment?              | 2 | 1 | 1 | 18 | 8
2. Rate how the system increases your motivation in learning computer programming. | 0 | 0 | 1 | 19 | 10
3. Do you find the marks given by the system fair?                                 | 0 | 2 | 2 | 15 | 11
4. Rate how the system helps you identify your weaknesses in programming.          | 0 | 0 | 0 | 12 | 18
5. Rate how the system helps increase your team spirit.                            | 3 | 0 | 0 | 8  | 19
6. Rate your overall experience.                                                   | 0 | 0 | 4 | 19 | 7

N = "No answer", 1 = "Poor", 2 = "Below average", 3 = "Good", 4 = "Excellent"

The survey results show that 86.6% of the students found the experience enjoyable; one student did not like the experience at all, while two other students did not answer. 96.6% of respondents indicated that the system helps increase their motivation in learning computer programming. All except one student agreed that the marks awarded by the system were fair. The results also show that 100% of the respondents think the system could help them pinpoint their weaknesses in computer programming, and all except three students found that the exercise helps increase their teamwork. Finally, 86.6% of the respondents rated the overall experience as good.

4.2 Performance of the system

We were interested in how the system would perform in an actual teaching environment. We therefore selected 10 programming questions of average difficulty and instructed our students to correctly answer as many as they could, as fast as they could. The details of the setup used in the performance test are listed below.

- Computer: Intel Core 2 Duo, 2 GHz
- Memory: 3 GB
- Compiler: GCC v3.2.3
- Profiler: gprof v2.13.90
- Similarity function: sim_text
- Number of students: 30
- Number of programs: 190
- Number of program lines: 6494
- Average number of lines per program: 34
Table 4: Results from the performance test

Compilation time                                           84.00 sec
Compilation time (instrumented)                            85.00 sec
Loading, execution and report generation                   25.65 sec
Loading, execution and report generation (instrumented)    30.40 sec
Similarity checking                                         6.36 sec
Total processing time                                      116.01 sec
Total processing time (instrumented)                       121.76 sec
Average processing time / program                            0.61 sec
Average processing time / program (instrumented)             0.64 sec

The test shows that an instructor would typically require less than four minutes to complete the assessment process for a class of thirty students. Even if the class grows to forty students, only about six minutes would be required. This is, of course, a big improvement over manual marking.
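As a rough check on these figures (our own extrapolation, assuming processing time scales linearly with the number of submitted programs and counting both the plain and instrumented passes from table 4):

total time for 30 students ≈ 116.01 s + 121.76 s ≈ 238 s ≈ 4 minutes
total time for 40 students ≈ (40/30) × 238 s ≈ 317 s ≈ 5.3 minutes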

We have not counted the time taken to transfer the programs to the server, since this happens before the assessment process and therefore does not affect the performance of the assessment system.

Given the setup and resource requirements to implement the system, we conclude that it is indeed scalable: the same setup can be replicated in other labs to achieve the same level of productivity.

5.0 Conclusion and Future Work


Through our small setup, we have shown the feasibility of implementing a simple and lightweight system in any computer programming laboratory, provided there is a minimum set of networking devices to link the computers together. We have also demonstrated that the assessment process does not require a lot of computing power and takes only minutes to complete. Scalability is not an issue because the same setup can be replicated in other laboratories to achieve the same level of productivity. Through the experiments with our students, we can also conclude that the system is fairly accurate, since the majority of students agree that the marks given were fair and met their expectations.

We must also acknowledge that the system is only capable of assessing the functional aspect of a program, given that the verification method employed here is a black-box test. We are thus unable to measure aspects such as the elegance of a program, except where it is tied to performance. We are also not able to test a program's behaviour when it comes to handling illegal input and exceptions.

Future work can be directed towards improving the similarity checking function to allow more flexibility when comparing the outputs of the programs with the answer scheme provided by the instructors. A better similarity checking function could also be used to detect plagiarism in the programs submitted to the system.
References

Arefin, A. S. (2006). The Art of Programming Contest (2nd Edition, Special Online Edition). Gyankosh Prokashoni.

Christopher, D., David, L., & James, O. (2005). Automatic test-based assessment of
programming: A review. J. Educ. Resour. Comput., 5(3), 4.

David, J., & Michelle, U. (1997). Grading student programs using ASSYST. Paper presented at
the Proceedings of the twenty-eighth SIGCSE technical symposium on Computer science
education.

Hidekatsu, K., Kiyoshi, A., Hiko, M., & Katsunori, M. (2006). Using an automatic marking system
for programming courses. Paper presented at the Proceedings of the 34th annual ACM
SIGUCCS conference on User services.

Hussein, S. (2008). Automatic marking with Sakai. Paper presented at the Proceedings of the
2008 annual research conference of the South African Institute of Computer Scientists and
Information Technologists on IT research in developing countries: riding the wave of
technology.

James, F. B. (2008). A new paradigm for programming competitions. Paper presented at the
Proceedings of the 39th SIGCSE technical symposium on Computer science education.

Janet, C., Kirsti, A.-M., Ursula, F., Martin, D., John, E., William, F., et al. (2003). How shall we
assess this? Paper presented at the Working group reports from ITiCSE on Innovation and
technology in computer science education.

Michael, B., Steven, G., Ann, N., & Vallipuram, M. (2004). An experimental analysis of GAME: a
generic automated marking environment. SIGCSE Bull., 36(3), 67-71.

Mike, J., Nathan, G., & Russell, B. (2005). The BOSS online submission and assessment system. J. Educ. Resour. Comput., 5(3), 2.
