
Fall 2010

Program Evaluation
Alyssa Geiger

Case Study: A Graduate School’s Language for the Degree Programs

Monterey Institute of International Studies


FA10 IPOL 8644
Table of Contents

Introduction……………………………………………………………………………………………..3-5

Background………………………………………………………………………………………….......5-6

Evaluation Practices and Procedures………………………………………………………………….6-9

Analysis…………………………………………………………………………………………………6-24

Literature Review…………………………………………………………………………………….9-10
Figure 1: The Assessment Loop

Measuring Language Learning Outcomes: best practices…………………………………………..10-17


Figure 2: Measuring Performance
Figure 3: Student Learning Outcome Assessment

Program Logic……………………………………………………………………………………12
Figure 4: Systems Model

The Logical Framework Matrix………………………………………………………………12-14


Figure 5: A Logical Framework Matrix for the Language in the Degree Programs...15

The Levinger Evaluation Design Matrix……………………………………………………........17


Figure 6: A Levinger Evaluation Matrix, Program’s Current Evaluation Practices...17-18

Analysis of Current Evaluation Practices…………………………………………………………..18-24

Recommendations…………………………………………………………………………………….24-26

Conclusion………………………………………………………………………………………………..27

Appendices…………………………………………………………………………………………….28-33

Bibliography……………………………………………………………………………………………...34

Introduction

No policy, program, project or person is perfect, so engaging in program evaluation should be an ongoing, iterative process that is both natural and logical. Evaluation has a reputation among many people as worthless, something that could destroy a program or project by proving it unsuccessful, when in fact it determines a program’s worth and provides useful feedback on how to improve it. Sounding out the word, e-value-ation, makes clear that it is an important process for providing accountability and a measure of impact, determining whether added value can be attributed to a given program. Evaluation is defined as “a social science activity
directed at collecting, analyzing, interpreting and communicating information about the workings and
effectiveness of social programs.”1 It provides decision makers with meaningful information so that they
can answer three essential questions: Are we doing the right things (strategy)? Are we doing things right (operations)? Are there better ways of doing them (learning)?2 With these answers, decision makers can take the right course of action.

Evaluation is both an experimental and a creative process that asks and answers questions specifically
tailored to each situation in order to provide meaningful information. It is important to be aware of the
methods and procedures used to answer these questions, as well as the relationship stakeholders have to
the evaluation itself: who receives what, and when, whether, how and why it will be used. Clarifying the nature
and scope of the different concerns, assumptions and perspectives of the parties involved enhances the overall
understanding of underlying issues so that they may be adequately addressed, the recommendations
made useful, and the lessons learned made meaningful.

Evaluation can occur at any stage of a project, program or policy life cycle, and is performed as an ex
ante evaluation (before implementation), a formative evaluation (during the implementation phase) or
a summative evaluation (after a program’s maturity), depending on what the evaluation is trying to
assess. Different metaquestions can be asked at each stage, in any of the three aforementioned
evaluations, to clarify the purpose of the evaluation. Good evaluation questions “identify
distinct dimensions of relevant program performance and do so in a way in which the quality of the
performance can be credibly assessed.”3 For example, does the program go beyond satisfying the target
group by actually evoking a change in the human condition, or is it ineffective? Is this change measured,
and how?

The first stage is problem identification and scope, or needs assessment, to find out where “dysfunction is
occurring in the target population or area, as it relates to the intervention.”4 Alternatively, the
problem can be positively stated: what does the program aspire to do, where does it want to be, or what
would the future look like?5 An evaluation at the design phase is concerned with the quality and
soundness of the design, and asks what the program is trying to accomplish (intervention), when
(timeline), where, with whom (stakeholders involved), how (resources), and what activities/actions are to be
implemented in order to achieve what end (its goals, results and outcomes). Maintaining a relationship or
fit between the design and the values/concerns of the organization, such as participation, sustainability,
1 Peter H. Rossi et al., Evaluation: A Systematic Approach (California: Sage Publications, Inc., 2004), 2.
2 The World Bank Group, “Module 11. Building a Performance-Based Monitoring and Evaluation System.” International Program for Development Evaluation Training 8 (1994): 13.
3 Peter H. Rossi et al., Evaluation: A Systematic Approach (California: Sage Publications, Inc., 2004), 70.
4 The World Bank Group, “Module 5. Impact, Descriptive, and Normative Evaluation Designs.” International Program for Development Evaluation Training 8 (1994): 2.
5 Notes, 9/2 class #2 of Dr. Beryl Levinger’s Program Evaluation.

empowerment, equity, efficiency and evoking a change in the human condition (not just a change in
knowledge), is crucial to having the intervention or program adhere to the mission and vision of the
organization. Hence, evaluation and design are two sides of the same coin. At the piloting stage,
the design is tested by asking process questions about resources, capacity, and organizational
structure, as well as lower-level outcomes, to see if the program will work as originally intended. Early
expansion involves questions of maintaining quality throughout the scale-up, paying close attention to
higher-level outcomes and to whether inputs and outputs are sufficient to achieve program
goals. Last is the impact stage, which asks questions such as: To what extent were goals achieved, and
what impacts (intended and unintended) did the program or intervention have? For each stage, it is essential to
note the key stakeholders who are involved or need to be involved. Concluding the evaluation, a decision
is made to stop, revise, redesign, or maintain the status quo. Program failures that influence this
decision include the right design for the wrong target group, the wrong design for the right target group,
and the wrong design for the wrong target group.

Depending upon the intention of the evaluation, different evaluations and approaches may be used. An
evaluation using performance-based or results-based monitoring measures a program’s progress towards a
specific result. Traditional monitoring and evaluation (M&E, or preferably MEL, where the L signifies
learning) monitors and evaluates programs continuously over time for the purpose of learning so
that they can be improved. If the purpose of an evaluation is to assess a program’s theory, process and
impact evaluations are most helpful. Program theory is the logic that connects activities to intended
outcomes, justifying what the program is doing and why (process) and relating it to how it will lead to
desired results (impact). Process evaluation looks at “the internal dynamics of the implementing
organizations, their policy instruments, their service delivery mechanisms, their management practices,
and the linkages among these”6 as a way of verifying if a program did indeed perform as intended and to
the degree it was intended to. Impact evaluation looks at the extent to which a program is adequately
meeting the needs of its target group and whether the program is having the desired impact. Many
organizations rely on these evaluation practices to generate information that will help them improve and
add value to their programs. To what extent does School X use evaluation to measure the success of its
programs? In this paper, I will evaluate the current evaluation practices of School X’s Language for
Degree Programs using program theory and impact evaluation.

My initial interest in choosing School X’s Language for the Degree Programs as my case study was
sparked by a recent Language Studies survey in the spring of 2009, which addressed ongoing student
concern and criticism of the Language for the Degree Programs. By using process evaluation, I can
ascertain whether or not the program is effectively delivering the services to the target group of students,
and to what extent these services are producing intended outcomes. I interviewed several stakeholders in
order to gain a rich picture of how the program’s evaluation processes have evolved over time and to learn
how those processes have met the needs of the target group within the program.

The overall purpose of the evaluation is to give School X’s Language for the Degree programs an
overview of the state of their current evaluation practices and policies and to provide feedback and
recommendations on how they can improve their evaluation practices. Specifically, I would like to help
the program measure impact more effectively so that findings are conducive to increasing program
performance. It is my hope that this paper also highlights steps the program has already taken towards
measuring the learning outcomes associated with the program and helps get those efforts back on the agenda. In
light of the recent merger between School X and another school, evaluative practices could be potentially
streamlined. This process could result in the sharing of best practices to improve language programs and
increase the overall quality of education at both schools. I will “evaluate” the evaluation practices and
procedures of School X’s Language for the Degree Programs by asking the following metaquestions:

6 Peter H. Rossi et al., Evaluation: A Systematic Approach (California: Sage Publications, Inc., 2004), 57.

1) To what extent is School X’s Language for the Degree Programs using appropriate evaluation
processes to obtain meaningful information that can help improve program performance?

2) To what extent are the programs’ findings conducive to improving program functions?

I will answer these metaquestions by using the following tools: a literature review, several interviews
with deans and language faculty of the program, best practices of language program evaluation, program
theory, impact theory, a Logical Framework, and the Levinger Evaluation Matrix. Throughout the
interview process, snowball sampling was employed to pinpoint faculty engaged with the program’s
evaluation process. The following documents illustrate the evaluation policies and procedures and are
included as appendices: 2007 Language Studies survey (end of every semester), School X’s Class of 2009
One Year Out survey, Language Studies survey Spring 2009 with existing data, and non-Western
Languages LS Portfolio Requirements-January 25, 2008 Draft. First, I will describe the background of
Language for the Degree Programs at School X. Second I will give an overview of School X’s evaluation
policies and procedures for the Language for the Degree Programs. Third, I will elaborate on important
evaluative processes for language programs in my literature review, as well as their best practices, including
the Logical Framework and Levinger Evaluation Matrix. Fourth, I will analyze their evaluation policies
and procedures and report my findings. Finally, I will end with recommendations and a conclusion.

Background

Since the founding of School X, there has always been an emphasis on language learning and cultural
awareness. The school first started offering an MA in languages and literature in French and German. An
MA degree in Political Arts was added, followed by Russian, Japanese and Spanish, and by degrees in Education
and Chinese. There was then a shift from offering degrees in languages to year-round degree programs
with a 16-unit language requirement, known as the Language in the Degree Programs. By the
late 1980s the language requirement had dropped to 12 units. School X’s Language for the Degree programs
remains rooted in, and “one” with, the tradition of language learning at the Institute, as well as an innovative
leader in international professional education.

School X’s Language for the Degree Programs offers mandatory language classes to policy and business
students7 in Arabic, Chinese, English, French, Japanese, Russian and Spanish at the 300-level or above.8
Model courses emphasize a specific topic to be discussed from cross-cultural and linguistic perspectives
and are offered in select language sections, along with Directed Language Study. The Language for the
Degree Programs also offers an online diagnostic test where students can self-assess their own language
levels prior to applying or arriving at the school, as well as a placement test at the start of school that
assesses a student’s language abilities.

The goal of the programs is to have language integrated into the core curricula of the graduate degree
programs, providing students with linguistic and cultural knowledge (objectives) that will enable them “to
reach across cultural and linguistic barriers towards more global career opportunities”9 (purpose). The
Language in Degree Programs is generally content-based, focusing on “professionally-relevant content
areas such as politics, business, policy, environment, and social issues.”10 It emphasizes discussion,
research and the use of authentic articles and materials instead of textbooks to develop and improve
language and analytical skills. Assumed to be the experts, the language faculties have considerable
7 Students with extremely strong English skills who place out of the language requirement, as well as students who want to learn another language, are not included in the program’s target group. Nearly every native English speaker in the Policy and Business programs is included in the target group.
8 Arabic is currently being offered at the 200-level and above due to increased demand and overall language proficiency levels. If a student’s language is different from those offered, the program may be able to accommodate it.
9 School X’s website.
10 School X’s website.

flexibility to set their own objectives and to decide on the content as well as assignments and assessment
methods for each course.

Evaluation Practices and Procedures

The evaluation practices and procedures for the Language for the Degree Programs are largely informal
and ad hoc, and are considered to be standard compared with those of other universities. They include
surveys at the end of each semester, some mid-term surveys, alumni survey questions, notes on phone calls
from alumni, enrollment records for the Language for the Degree Programs, including the total number of
students and a breakdown of the number of students in each language program, monthly meetings with
language faculty, notes on written and oral complaints or praises, ad hoc interviews, an online diagnostic
test for student self-assessment, a placement test with both written and oral components, a monitoring
evaluation every seven to ten years, as well as a few past interventions which I will describe later in this
paper. The level of funding devoted to evaluation-related activities, and the share of the total budget this
allocation represents, was not disclosed. Since the evaluation practices are largely informal and
ad hoc, the amount budgeted is likely not significant and therefore might not be easily accessible.

All students of School X’s Language for the Degree programs have the opportunity to take the online
diagnostic test in order to self-assess and estimate their language levels prior to applying or arriving at the
Institute. Results from this assessment may influence a student’s decision to come to the Institute or help
him or her decide whether or not they need to take additional language classes before they arrive. All
students are administered a placement test prior to the start of classes that assesses both their written and
oral language abilities. Language faculties administer each test and ultimately decide upon the appropriate
language level to place the student in depending upon the overall score from each test as well as how well
the student does on the test compared to other students in the same language. After this initial language
assessment, a student’s language ability is not tested again; evaluation is instead based solely on class
performance. Students still have the option to place out of their language requirement by taking an
additional test administered by the language faculty of their language, whereby they must demonstrate
language ability above the 400 level. Students who wish to learn a different language starting below the
200 level or who do not achieve a language level of 200 or higher on the tests must bring their language
level up by either taking custom language courses or tutoring during their studies or enrolling in a
summer intensive language program at School X. The students who meet these minimum requirements
make up the target group of the Language for the Degree Programs. Students may also submit a proposal
for Directed Study in their language provided that a faculty member agrees to work with them. It is
assumed students coming to the school want to study their second language, so the number of students
waiving out of their language requirement is around four per year,11 which is very few compared
to how many students are enrolled in the language program.

The Records Office keeps track of enrollment numbers for the Language for the Degree Programs, which
reached 445 in 2007. It is preferable to keep the class size around twelve in order to make the best use of
faculty resources (addressing an upper administrative concern) and to ensure “good learning” is
happening. However, depending upon the popularity of the language and program resources, class size
can range from around six to seventeen. Five is set as an absolute minimum, but there have been cases
where the class convenes with as few as three.

Although some teachers design and collect their own mid-term evaluation surveys, all language classes,
language programs, student expectations and satisfaction, directed studies, and faculties are subject to
evaluation through a survey at the end of every semester (See Appendix A for the survey). The survey
itself has not undergone many changes for several years, except for recently changing medium from a
paper survey to an online survey administered by the Records Evaluation in 2008. The result of this was a

11 Interview, 9/28/2010.

noticeable drop in the response rate. It is administered to language classes at the end of the semester and
is not mandatory, but highly encouraged. Although it is unclear who first created the survey and how they
were selected, it was not student derived.12 The survey was designed to measure the actual outcomes of
the program in relation to its objectives. Some of the objectives for course set by the faculty “have to be
broad,” because of varying expectations and aspirations of each student on exactly where and how to
improve their own language skills. Faculty normally chooses an overarching topical area that can be
looked at from different standpoints so that students can choose and explore areas of interest. Maintaining
a balance that meets most of the students needs is critical. The standards set forth in the survey to meet
these needs and to determine how well the program is functioning are expressed using the Likert Scale in
frequencies ranging from 1 to 5, and include meeting a satisfaction level of around 4 “happy” or above,
“very happy” at 5. The survey questions are grouped by category: Course Organization, Instructors
Performance, Instructors Relationship with Student, Outcomes (specifically deepening mastery of subject
matter, overall contribution to my learning), My Contributions To My Learning, Grade I Anticipate To
Earn, Overall Rating Of Course, and Overall Rating Of Professor. Student comments about the classes,
including what they liked and did not like as well as suggestions on how it can be improved are also
considered significant indicators of the program’s performance. The style or method of teaching and the
newness of the course are considered if surveys for a given class average below 4, to determine
“what needs to be fixed,” or if the surveys receive exceptionally high marks, to find out what is working
well.
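To make this standard concrete, the following minimal sketch (in Python, with invented category names and scores; it is not the program’s actual tooling) shows how end-of-semester responses on the 1-to-5 Likert scale could be averaged per survey category and flagged whenever a category falls below the informal threshold of 4:

# Hypothetical end-of-semester survey summary; data are illustrative only.
from statistics import mean, median, stdev

THRESHOLD = 4.0  # "happy" or above, per the program's informal standard

responses = {
    "Course Organization":      [5, 4, 4, 3, 5, 4],
    "Instructor Performance":   [4, 4, 5, 5, 4, 3],
    "Outcomes":                 [3, 4, 3, 4, 3, 4],
    "Overall Rating of Course": [4, 5, 4, 4, 5, 4],
}

for category, scores in responses.items():
    avg = mean(scores)
    flag = "REVIEW" if avg < THRESHOLD else "ok"
    print(f"{category:26s} mean={avg:.2f} median={median(scores):.1f} "
          f"sd={stdev(scores):.2f} [{flag}]")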

The surveys over the past few years have indicated that the program is performing above average.
Surveys for each language class are compared with those from previous semesters in order to gauge the
teacher’s overall performance. The results of the survey for each class are shared only with the professor
or adjunct professor of the class and the Dean. The Dean reviews the survey results and provides input
and comments to the faculty and asks questions pertaining to the results in order to gain better insight into
the successes or problems within a given program or class. The Dean also listens to students’ vocal or
written complaints or problems and keeps notes on each student’s visit. It is estimated that out of seven-
hundred-and-fifty students, only about twenty vocalize their views on the program and these views are
predominantly negative. As this proportion is quite small compared to the number of students who are
likely to have a favorable view of the program, the Dean and the Program Chair tend to get a biased sample
regarding program feedback. The faculties interpret the results of the surveys and use them to guide
changes in their program or class based on what is working and what needs improvement. The Program
Chair does not receive or review the results of the surveys; however, the Program Chair plays a
central role in relaying information from the Career Advising Services to the Language for Degree
programs and vice versa. As an advisor, the Program Chair serves an invaluable role, listening to and dealing
with students’ complaints and problems with the language programs and classes to come up with
solutions. As a facilitator and arbitrator, the Program Chair attends advising meetings and relates
important feedback to faculty so that they know what to address. To get a better picture of overall
program performance, the program relies on the data from the surveys, which offer more balanced points
of view.

Alumni Relations sends an alumni survey one year after students graduate, three times a year, in May,
August and December. The survey questions seek to determine to what extent students are using their
language in careers and how it has contributed to their careers. The survey includes two questions that are
relevant to the language programs and was recently redeveloped using the Dean’s input (See Appendix B
for the Alumni survey). All of the language faculties are interested in the information from this survey, as
it reveals how important and/or relevant learning the language in the program actually is for the target
group. It is unclear if or to what extent the data from prior years serves as baseline data for determining
indicators to measure program performance.

12 Interview with a staff member, 9/29/2010.

In the Spring 2009 semester, there was an evaluation of the Language studies. As a monitoring
evaluation, the following program assumptions were tested: where the program is at present, areas of
strength and improvement, the extent to which student expectations are being met, the extent to which the program itself
influenced students’ decision to come to the school, and what the program can/cannot or should/should
not offer. There were no specific goals for the evaluation and therefore no specific outcomes or standards
against which to compare the results. A Language Taskforce was created to design the evaluation and to
discuss its results. The Taskforce included faculty from all the different language programs, deans, the
provost, and faculty representing all the different degree programs (not necessarily program chairs). The
Language Studies Survey was an open-ended questionnaire consisting of three questions, each with three
sub-questions (Appendix C). The questionnaires were given out to classes of the different language
programs by non-language faculty. Professors of each class were asked to leave the room for twenty
minutes while the students filled out the questionnaire. In total, one-hundred-and-twenty-one
questionnaires were returned. A Dean volunteered to compile the data, to analyze the results and present
the findings to the Taskforce. The Dean read through 10% of the surveys and then coded the answers.
Once coded, the Dean hired a few work-studies to code the rest of the questionnaires so that the results
could be collected. The actual results from the Language survey confirmed what the program already
knew; none of the results seemed surprising. Challenges included balancing the language levels within
courses, making class projects relevant to individual student goals, and maintaining flexibility with
faculty. The Taskforce discussed options for students who want to learn another language and
decided to promote summer programs at other schools or the school’s own summer intensive
programs and to pilot a 200-level language course for beginners.

One major intervention to bring about change in the program was pre- and post-testing, designed to
measure the tangible learning outcomes of the language in the degree programs. The Dean as well as all
language program faculties would have access to the findings to improve upon subject areas on the test
that had poor results. The students would also use the scores to assess their improvements in the language.
The initial idea was discussed around 1997/1998; concerns were addressed and details for the design were
products of a workshop in 1998. It was piloted for one year in select languages at the start of 2000, and
tested the language level of the students before the first semester and after their last semester. The results
of the pre- and post-testing proved inconclusive, and thus were never used. However, the failure of this
intervention to measure tangible learning outcomes brought about enough support in the program to try a
second approach.

A second intervention took place shortly after the pre- and post-testing to test for learning outcomes: a
language portfolio requirement. The results were intended to assess the effectiveness of the program and
give language faculty feedback on where and how they can further improve the program. Students were to
play an active role in assessing the improvement of their language skills and had the opportunity to
continuously learn about their strengths and weaknesses throughout their language study. Students could
also present the portfolio to potential employers as a tangible deliverable showcasing their language skills.
The emphasis of the portfolio was on using qualitative methods, in addition to quantitative ones, to assess
language level. From 2001 to around 2003, guidelines were discussed for what to include in the portfolio and
how best to assess learning outcomes. The portfolio consisted of samples of a student’s work (both oral
via video footage of presentations and discussion, as well as written assignments, papers and exams) from
the first to the last semester. It also included a student’s self-assessment of his or her own work, detailing
where the student had improved and by how much. The portfolio was piloted for one semester before it was
folded into the school’s portfolio degree requirement in order to reduce labor costs and streamline
requirements. In 2005/2006, the school dropped the portfolio requirement, bringing the language portfolio
to an abrupt end. The effects the portfolio had on learning outcomes, as well as how successfully it
measured them, were not analyzed due to the short pilot phase. Nevertheless, in 2009 a sample of students
in almost every language program who completed the language portfolio were interviewed to learn more
about the potential positive impacts of the portfolios on the program.

Analysis

In order to analyze School X’s Language for the Degree Programs’ current evaluation practices, I seek to
answer the following questions:

1.) To what extent is School X’s Language for the Degree Programs using appropriate
evaluation processes to obtain meaningful information that can help improve program
performance?
2.) To what extent are the programs’ findings conducive to improving program functions?

To answer these questions objectively, it is important to first consider best practices of language program
evaluation in the literature review as well as helpful program evaluation tools in order to analyze the
program’s evaluation practices.

Literature Review

There has been much discussion in the literature about the importance of evaluation in language
programs. U.S. regional accreditation bodies call upon “institutions, departments and faculty to gather
evidence that students are really learning what they are supposed to learn-as determined by the programs
and institutions themselves-and to utilize that evidence internally for making decisions and revising
program practices” (e.g., The Higher Learning Commission, 2003). The Association of Departments of
Foreign Languages offers guidelines that encourage evaluation of language programs to include multiple
methods of assessment and measurement of learning outcomes instead of “often the only measure of a
teacher’s effectiveness: the student survey at the end of the term.”13 Assessing and measuring learning
outcomes is an essential component of a program’s overall program evaluation process and practices.

Assessment, according to John Norris of the University of Hawaii, “is the systematic gathering of
information about student learning in support of teaching and learning. It may be direct, objective or
subjective, formal or informal, standardized or idiosyncratic, but it should always provide locally useful
information on learners and on learning to those individuals responsible for doing something about it”.14
Therefore, assessment plays a central and vital role in the functioning of a program, as it supports the
building blocks of a program’s work towards achieving student learning outcomes. Barbara Wright of the
Western Association of Schools and Colleges considers assessment “an internal process of inquiry and
improvement that, based on evidence, provides answers to specific questions about student learning
arising from specific institutional goals.”15 She uses the terms “evidence” or “documentation”, instead of
“measurement” to expand assessment options for more meaningful learning outcomes. Figure 1 below
shows an assessment loop depicting the steps of evaluation, including the crucial and often omitted
steps of interpreting the evidence and putting it to use.

Figure 1: The Assessment Loop16

John Norris also emphasizes these last two phases of the loop. He addresses the problem of the actual act

13 Association of Departments of Foreign Languages, “ADFL Guideline on the Administration of Foreign Language Departments: Evaluating and Encouraging Good Teaching.” Adapted from a statement prepared by the ADE Ad Hoc Committee on Changes in the Profession: Teaching and Research; adopted by ADFL 1993; reaffirmed by the ADFL Executive Committee in 2001. Modern Language Association, http://www.adfl.org
14 Norris, John M. “The Why (and How) of Assessing Student Learning Outcomes in College Foreign Language Programs.” The Modern Language Journal 90 (2006): 579. The University of Hawai’i.
15 The Modern Language Journal, Perspectives, Volume 90, Issue 4, 575, Winter 2006.
16 Wright, Barbara. “Learning Languages and the Language of Learning.” The Modern Language Journal, Volume 90, Issue 4, 594, Winter 2006.

of assessment, the “doing,” before looking at how it will be used. Assessment in and of itself is not
enough. In asking the question “what does it mean to understand, improve, and demonstrate the
value of our programs?”,17 Norris locates the real meaning of assessment in its actual contribution to the
enhancement of programs. Expanding understanding and knowledge about assessment, as well as defining
roles for those who participate in it, is the only way to ensure it will be most useful and sustainable. The
Association of Departments of Foreign Languages also stresses the importance of participation in the
evaluation process and recommends that chairs and faculty take a lead role in the design and implementation of
the assessment, so that the results genuinely reflect the program. Student views on programs, gathered through
questionnaires and meetings, should also be considered, and students should be involved in the
administration of the questionnaires.18 Geoffrey Chase of San Diego State University adds that “a focus and
reorientation of roles and relationships directed at meaningful student learning to cultivate a culture of
learning”19 is also needed to truly internalize evaluation as a value within a community, institution or
program. Michael Morris specifies the role of the administrator in ensuring an environment where learning
is the primary goal of the organization.20 If highly valued, evaluation presents a tremendous opportunity for
programs to demonstrate their ability to manage a tradition of quality and innovation, which makes for good public
relations, according to Richard Kiely of the Centre for Research into Language and Education at the
University of Bristol, England. This merits high attention during hard economic times, as holding higher
education institutions accountable to certain standards helps ensure consumers get what they pay for.21
How and which language learning outcomes are measured within an evaluation can depend upon the
values of the organization, the users of the information, and the form of assessment or data collection
instruments, techniques and analysis available and/or possible. Measuring these outcomes effectively,
however, can be quite difficult.

Measuring language-learning outcomes: best practices

A look at best practices of how language-learning outcomes are measured effectively can be extremely
helpful in an evaluation. Language learning, such as the improvement of language and analytical skills
(specifically oral communication and proficiency, culture and geography, reading, including comprehension
and analysis of both literary and nonliterary texts, and written expression), is best documented by using a
set of indicators within different forms of assessment. Indicators should contain quality, quantity, and
time-bound or time and place (QQTP) measures in order to recognize precise changes (in success or
progress, for example) and to be objectively verifiable. Quality can be expressed by the value of something
measured (absence of, degree of); Quantity represents a statistical measure of something (percent of,
frequency of, number of); Time indicates the start and end point between which an indicator is measured;
Place is the specified location at which the change occurs. As illustrated in Figure 2 (Measuring
Performance), indicators are most effective in measuring outcomes if they establish a baseline (to
determine the situation at the beginning of the planning period), set a target (commitment), and
measure achievement (the actual result).22



17 Norris, John M. “The Why (and How) of Assessing Student Learning Outcomes in College Foreign Language Programs.” The Modern Language Journal 90 (2006): 577. The University of Hawai’i.
18 Kiely, Richard. “Evaluation, Innovation, and Ownership in Language Programs.” The Modern Language Journal 90 (2006): 598. Centre for Research into Language and Education (CREOLE), University of Bristol, United Kingdom.
19 Chase, Geoffrey. “Focusing on Learning: Reframing Our Roles.” The Modern Language Journal 90 (2006): 584. San Diego State University.
20 Morris, Michael. “Addressing the Challenges of Program Evaluation: One Department’s Experience After Two Years.” The Modern Language Journal 90 (2006). Northern Illinois University.
21 Dickeson, R. (2006). The need for accreditation reform. Washington, DC: The Secretary of Education’s Commission on the Future of Higher Education.

Performance of a program in relation to a given outcome can then be effectively measured. Indicators
can be classified in four groups: process indicators (interpersonal communication/transactions), input
indicators (things that were supposed to be provided at a certain time), outcome indicators (which signify
an observable change in the human condition) and impact indicators (which signify the highest change in
the human condition, demonstrating program effectiveness). Data collection instruments (surveys,
self-assessments, interviews, focus groups, case studies, expert opinions and panels), and how they are
applied as techniques, should embody the indicators selected and fit the needs of the target audience.
Knowing the culture and environment in which the data is collected is important to ensure that the
techniques remain reliable and applicable. Learning outcomes can be measured both quantitatively and
qualitatively through various indirect (questionnaires, focus group activities) and direct (oral proficiency
interview, portfolio, standardized language exam or language achievement test) instruments. Exit exams
are noted as an ideal way to measure the level of language mastery and can include re-taking the
placement exam, a final portfolio, as well as an open-ended survey. Questionnaires should include
comments on content, structure, and aims, as well as comments on teaching style.23
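As a hedged illustration of the baseline-target-achievement logic described above (the indicator, the numbers and the function name are hypothetical, not drawn from the program), an indicator’s progress could be computed as the share of the planned change actually realized:

# Hypothetical QQTP-style indicator tracking; numbers are illustrative only.
def percent_of_target_reached(baseline: float, target: float, actual: float) -> float:
    """Share of the planned change (target - baseline) actually achieved, in percent."""
    planned_change = target - baseline
    if planned_change == 0:
        raise ValueError("Target must differ from baseline to measure progress.")
    return 100 * (actual - baseline) / planned_change

# Example indicator: % of students rating the learning environment 4 or higher.
baseline = 62.0   # situation at the start of the planning period
target = 75.0     # the commitment
actual = 71.0     # the measured achievement at the end of the period
print(f"{percent_of_target_reached(baseline, target, actual):.1f}% of target reached")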

The Student Learning Outcome (SLO) assessment (Stiehl/Lewchuk24) assesses educational effectiveness
and can be used in developing a comprehensive survey that gets at real learning outcomes. It includes a
clear statement of what students will be able to do outside the classroom as a result of what they have
learned.

Figure 3: Student Learning Outcome Assessment


SLO Criteria Yes No
Begins with active verb specifying what the student will be able to do
Focuses on student performance, not teacher performance
Represents abilities that can be transferred outside classroom
Can be measured

The literature highlights the need for self-assessments to evaluate students’ perceptions of their learning,
and portfolios to assess development of writing skills.25 This assessment is highly cost-effective and gives
students ownership over their own learning development as well as equips them with an important skill
for their professional future. Kathi Bailey, President of the International Research Foundation for English
Language Education (TIRF) and professor of Applied Linguistics at the Monterey Institute of
International Studies, calls for more attention on student perceptions of language learning, as it offers
valuable insight as to how a student perceives his/her progress in a language. This can influence learning
outcomes such as students’ confidence or stress levels, allowing students to gauge the language
ability they are capable of by the end of the program. Portfolios can either take the form of progress
portfolios or portfolios that contain a student’s best work. The overall purpose of a portfolio, for example,
might be to assess how well a department prepares its graduates instead of measuring a student’s level of
proficiency. The goals of the portfolios can demonstrate achievement in one or more areas and reflect
growth in subject matter. Students can be given the opportunity to submit only what they consider to be a
reflection of their growth in the language program.

22 PowerPoint 5 of Beryl Levinger’s Program Evaluation class.
23 Kiely, Richard. “Evaluation, Innovation, and Ownership in Language Programs.” The Modern Language Journal 90 (2006): 598. Centre for Research into Language and Education (CREOLE), University of Bristol, United Kingdom.
24 http://www.leeward.hawaii.edu/assessment-slo
25 Liskin-Gasparro, J. (1995). Practical approaches to outcomes assessment: The undergraduate major in foreign languages and literatures. ADFL Bulletin, 26, 21-27.

ACTFL Proficiency Guidelines and the Oral Proficiency Interview (OPI) established a benchmark for the
assessment of foreign language proficiency, based on performance assessment instead of the
measurement of knowledge. OPI measures language production as a whole by determining the level of
proficiency attained by an individual without regard to the method used to learn the language.26 The test is an
interview conducted in various forms that may include role-play. It is carried out by a trained proficiency
examiner who checks for consistency of proficiency using three levels of questions ranging from warm-
up to challenging to a cool-down.

Program Logic

A solid understanding of a program’s logic constitutes a very important best practice of evaluation.
Program theory, the logic of what activities, outputs or interventions lead to intended learning
outcomes and impacts (illustrated by the black box in figure 4 below), can inform an evaluator whether a
program’s poor impact is due to a theory failure or implementation failure. Figure 4 takes a systems
approach to the process, illustrating the role the environment plays either hindering or aiding the
conversion of program inputs to outcomes.
Figure 4: Systems Model

Impact theory can inform an evaluator of expected language learning outcomes the program intends to
achieve and exactly how these are measured, so that the extent to which the program is adequately
meeting the needs of its target group and whether the program is having the desired impact can be
evaluated. This impact evaluation conducted at the end of a program’s cycle should determine the level of
change in behavior (an actual change in the human condition) that has come about as a direct result of the
program, justifying the program’s purpose. Carefully chosen indicators document this change and can
measure a program’s effectiveness. The strength of these linkages will indicate whether or not the
program is operating as intended and if it is in line all the way up to its overall goal.

The Logical Framework Matrix

The logical framework matrix, or logframe, is an effective evaluation tool that provides the foundation for
program logic analysis (See Figure 5). In examining the logic of the Language for the Degree Program’s
program design, the logframe matrix answers the question about the change in the human condition the
program aspires to bring about; in other words, the program’s purpose. Analyzing and testing a program’s
theory, the matrix is a useful tool that can provide stakeholders with a simple picture of the logic
supporting the program’s implementation and essential components.

Divided into four rows of “aims” (activities, outputs, purpose, goal), the logical
framework matrix is arranged as a hierarchy where each “aim” represents a level of project logic that
serves as the basis for, and builds a relationship to, the next level. The columns are applied to each level of
the logical framework matrix. At the lowest level are a project’s inputs, attributed to its fundamental activities:
the main project components and services, such as education, through which outputs are achieved. Thus,
26 Liskin-Gasparro, J. (1995). Practical approaches to outcomes assessment: The undergraduate major in foreign languages and literatures. ADFL Bulletin, 26, 21-27.

above this level is the output level where deliverables or products resulting from these inputs are
identified. These typically consist of networks and/or services or programs and help achieve the overall
purpose. At the purpose level, an essential question must be answered, “What is the major key change
related to the goal that this project will achieve at its conclusion?”27 Most importantly, this change must
represent a benefit in the human condition as a result of the program’s activities and outputs. This change
represents the results from the program in the short-term and thus produces tangible, low-level impacts.
At the last and highest level, an evaluator makes sense of all of the aims, determining how they fit
together to amount to the goal of the program. The goal level is defined in terms of quality-of-life
improvement and is a bottom-line condition of well-being of individuals, families, or communities. This is
usually a long-term result that is difficult to measure, but involves bigger changes leading to greater
impacts. By itself, a small program is not solely responsible for achieving the goal but is part of
something greater, combining forces with other programs and projects to serve an even higher purpose.

The second column contains different and specific objectively verifiable indicators (OVI), formulated according
to the QQTP principle, that measure the extent to which an objective has been achieved, such as the extent to
which the conversion of outputs to outcomes contributes to the achievement of a program’s purpose and goal.
Indicators typically measure the percent change in whatever needs to be measured. The third column contains the
means of verification (MOV) or how data is collected from a reliable and verifiable source. Key questions
here often ask who will pay for gathering the data, who will be involved and how much is worthwhile.
Assumptions or conditions challenging the principle of ceteris paribus or all things being equal are
included in the last column of the matrix to account for uncontrollable, external conditions affecting the
effectiveness of the program.
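As a rough sketch only (the field names and the example row are mine, not an official logframe template), the four-level hierarchy and its columns can be represented as a simple data structure:

# Hypothetical representation of one logframe row; content mirrors Figure 5.
from dataclasses import dataclass, field
from typing import List

@dataclass
class LogframeRow:
    level: str                                   # "Goal", "Purpose", "Outputs" or "Activities"
    narrative_summary: str                       # the aim at this level
    indicators: List[str] = field(default_factory=list)             # OVIs, QQTP-formulated
    means_of_verification: List[str] = field(default_factory=list)  # MOVs
    assumptions: List[str] = field(default_factory=list)            # external conditions

purpose = LogframeRow(
    level="Purpose",
    narrative_summary="Competence in language, analytical skills and cultural knowledge",
    indicators=["% change in pre/post placement scores upon completion of the program"],
    means_of_verification=["pre-post test with written and oral components"],
    assumptions=["second-language acquisition remains valued"],
)
print(purpose.level, "->", purpose.indicators[0])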

Keeping these best practices in mind and using information from the interviews as well as from the website,
I completed a logframe for the Language for the Degree Programs, determining what I believe to be its
activities, outputs, purpose and goal. I did not determine actual percentages for the indicators, and instead,
leave this up to the Language for the Degree Programs to decide so that the objectives are set from within
the organization. I have only listed what I consider to be necessary indicators in order to measure whether
or not the program is effective at each level, as the more indicators the program has, the more time and
money will need to be spent collecting the data. The activities are clustered into the programs’ main
services: providing authentic language articles, materials and education with a content-area focus, the
arrangement of classrooms and accommodation conducive to discussion and collaborative learning,
providing an online diagnostic test for student self-assessment, as well as designing, implementing and
assessing placement test for each language to place students in appropriate language levels. Receipts,
physical inventories, and financial budget to track expenditures, as well as observations, teacher
portfolios and placement tests can verify these activities. The necessary budgeted items or ‘inputs’ to run
these activities are also listed. Under the assumptions, absences of staff & students due to transportation,
personal or other reasons could negatively affect the inputs/activities, thus lowering the success of the
outputs at the next level. These should be kept at a minimum, but cannot be controlled.

These activities translate into outputs, which include a successful language program containing
language/culture classes and the Model courses with positive environments conducive to learning. The
most essential indicator to measure the learning environment is the percent change of student satisfaction
levels with the course’s learning environment at the end of the semester compared with previous
semesters. I will elaborate in the recommendations section on possible survey questions that could make
up this indicator. Knowing the percent of people served in each language program and the percent of total
students served by the Language for the Degree Program is helpful information for monitoring purposes
but does not tell us anything about whether or not the program is effective. Using percentages is the best
way to measure change that takes place from year to year. The output listed is possible, if the financial
support for the program continues.

27 Notes from PowerPoint eval8-09, Beryl Levinger’s Program Evaluation class.

The outputs support and generate outcomes to fulfill the purpose of the program. Competence in language
and analytical skills, as well as cultural knowledge, alone is not enough. These outcomes must be means to
a particular end, which signifies measurable benefits in terms of a behavioral or status change: to reach
global career opportunities.28 I identified the most critical indicators for measuring competence in
language and analytical skills and cultural knowledge. These indicators are measuring a low level change
in the human condition brought about by the program, and therefore do not measure the program’s
performance in a given semester. A pre-post test that uses standard scores would measure the percent
change in before and after scores of a student from the initial placement exam that includes a written and
oral proficiency test upon completion of the program. The tests would take a student’s starting language
proficiency into account and standardize the scores, so that a student who starts with a high level of language
ability and shows only a small raw gain can be fairly compared with a student who starts at a lower level
and shows a larger percent increase. The percent increase in the exit
survey score at the end of the language study compared to previous years is useful in determining the
program’s overall performance of delivering these outcomes to students consistently year after year. In
my recommendations section, I will present a few different questions along with another instrument, the
portfolio, which could make up the score as an indicator on the exit survey. The percent of School X’s
alumni hired based on their language and analytic skills within X years after graduation can be measured
by looking at job placements of alumni. Acquisition of a second language must be valued for the program
to achieve its purpose before moving on to the goal level.
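One way such standardization might work, offered purely as an assumption-laden sketch (the maximum score, the choice of a normalized-gain formula, and the sample records are hypothetical), is to express each student’s gain as a fraction of the gain that was still possible:

# Hypothetical normalized-gain standardization of pre/post scores.
MAX_SCORE = 100.0  # assumed ceiling of the placement/exit instrument

def normalized_gain(pre: float, post: float, max_score: float = MAX_SCORE) -> float:
    """Gain achieved as a fraction of the gain that was still possible."""
    headroom = max_score - pre
    if headroom <= 0:
        return 0.0  # student already at the ceiling; no measurable headroom
    return (post - pre) / headroom

# Two hypothetical students: one high starter, one low starter.
students = [("A", 85, 92), ("B", 40, 70)]  # (id, pre-score, post-score)
for sid, pre, post in students:
    print(f"student {sid}: raw gain {post - pre:+d}, "
          f"normalized gain {normalized_gain(pre, post):.2f}")

Under a scheme like this, a high starter with a small raw gain and a low starter with a large raw gain can receive comparable standardized results, which is the kind of comparison the proposed indicator would require.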

Although the stated goal of the program is to have language integrated into the core curricula of the
graduate degree programs, I have altered it to reflect a real change in the improvement of quality of life
for the students. If the language is integrated into the core curricula, then what change in the human
condition will be brought about? By looking at the overall goal of the graduate programs, School X’s
Language for the Degree Programs contribute to this higher objective of having the school’s graduates
become leaders capable of bridging cultural, organizational, and language divides to produce sustainable,
equitable solutions to a variety of global challenges. In order to assess this larger change, the program
could look at having an indicator that measures the percent change of the school’s graduates who feel
they are making a meaningful impact in their field each year compared to previous years. Questions from
the alumni survey as well as alumni job descriptions and the school’s stories from students and alumni on
the website could be the means with which to get information for this indicator. If the school’s programs
continue to be valued, this goal is possible.

28 Global career opportunities: a series of experiences, events and actions, characterized by mutual dependency between an individual and an organization in global interaction, that takes an individual’s work passion and talent, career goals and interests into account.

Figure 5: The Logical Framework Matrix for School X’s Language for the Degree Programs

Goal
- Narrative Summary: Graduates become leaders capable of bridging cultural, organizational, and language divides to make meaningful impacts in their fields.
- Objectively Verifiable Indicators (OVI): % change of School X’s graduates who feel they are making a meaningful impact in their field each year compared to previous years.
- Means of Verification (MOV): alumni survey; job descriptions; online stories from students and alumni.
- Critical Assumptions: programs of School X remain valued.

Purpose
- Narrative Summary: Competence in language and analytical skills and cultural knowledge to reach global career opportunities.
- Objectively Verifiable Indicators (OVI): % change in a student’s before and after scores from the initial placement exam (written and oral proficiency test) upon completion of the program; % increase in the exit survey score at the end of the language study compared to previous years; % change of School X alumni hired after X years based on their language and analytical skills and cultural knowledge compared to previous years.
- Means of Verification (MOV): pre-post test (including written test and oral proficiency interview (OPI)); exit survey; alumni survey.
- Critical Assumptions: second-language acquisition remains valued.

Outputs
- Narrative Summary: Successful language programs (language classes, positive learning environment).
- Objectively Verifiable Indicators (OVI): % change of student satisfaction levels with the course and learning environment at the end of the semester compared to previous semesters.
- Means of Verification (MOV): survey.
- Critical Assumptions: financial support for the program continues.

Activities (Inputs)
- Narrative Summary: provide authentic language articles, materials and education with a content-area focus that stimulates feedback (teacher-student, student-teacher, and student-student); arrange classrooms and accommodation conducive to discussion and collaborative learning; provide an online diagnostic test for student self-assessment; design, implement and assess a placement test for each language and a post-test at the end of the language program.
- Inputs as budgeted items: equipment, technology, materials and supplies, qualified staff/faculty, facilities.
- Means of Verification (MOV): receipts; physical inventories; financial budget to track expenditures; observations; teacher portfolio; placement tests.
- Critical Assumptions: absences of staff and students due to personal, transportation or other reasons are kept at a minimum.

Levinger Evaluation Design Matrix

Another evaluation design tool that can help an organization assess its current evaluation practices is the
Levinger Evaluation Design Matrix. The Levinger Matrix allows one to fill in and explain in detail how
to evaluate each goal of the program design. The components include a key question known as a meta
question, terms to be operationally defined (with definitions), key measurable indicators, data collection
instruments and techniques, data analysis techniques, and other (e.g., users, reviews, collectors, analysts,
decisions). As a best practice, learning institutions such as School X’s Language for the Degree Programs
at MIIS should measure language learning outcomes in order to know when and to what extent the
program actually attains these outcomes and how to enhance them. The matrix helps the evaluator
ultimately determine to what extent a program is or is not functioning and to what extent it can be
strengthened. I filled out a Levinger Evaluation Matrix below in Figure 6 with the Language for the
Degree Programs’ current evaluation practices related specifically to a key question on how they
measure language-learning outcomes. This evaluation question focuses on the area “of program
performance at issue” and facilitates “the design of a data collection procedure that will provide
meaningful information about the area of performance”29 for stakeholders and decision-makers. This
would be part of a summative evaluation that takes place at the end of a program’s life and seeks to
determine short-term changes in the participants’ lives. I operationalized the main term as I understand it,
in relation to the program’s purpose. I referred to the organization’s current evaluation practices to
fill out the rest of the matrix, including: key measurable indicators; the data collection instruments; data
collection techniques; data analysis techniques; data collection frequency, timing, and responsible
persons; and other (users, decisions, analysts etc.) I will mention each set of data used in the matrix in my
analysis of the program’s current practices.

Figure 6: Levinger Evaluation Design Matrix, Current Evaluation Practices

Key question: To what extent does the Language for the Degree Programs measure language-learning outcomes for its students?

Terms to be operationally defined (with definition): Language-learning outcomes: improved language and analytical skills (spoken, written, listening, reading) and knowledge/content-based language learning that students have attained as a result of their involvement in a particular set of educational experiences in the program.

Key measurable indicators:
-average survey scores for each class measured on a Likert scale of 1-5 at the end of each semester
-letter grades A-F at the end of each semester
-overall average survey scores compared with previous years
-# of written and oral complaints in a given semester
-# of students who drop out in a given semester

Data collection instruments:
-student evaluation surveys
-teacher's gradebook
-mid-term evaluation (some teachers)
-meetings with students
-alumni survey

Data collection techniques (including frequency, timing and responsible persons):
-teachers share a link to an online survey with students at the end of each semester, to be completed within two weeks after class ends
-teachers grade students based on papers, tests, presentations
-some teachers create and distribute their own mid-term surveys by hand once halfway through the semester, then collect and analyze them
-ongoing feedback/complaints are recorded by the Dean and Language Program Chair via written document at the time they occur
-Alumni Relations distributes and collects an online one-year-out survey three times a year; no particular person in the Language in the Degree Programs is responsible for collecting the results

Data analysis techniques:
-open responses disaggregated into what students liked and didn't like (positive vs. negative)
-disaggregated using frequencies of 1-5
-online survey generates analysis
-standardize
-statistical tests to find average scores, median and standard deviation
-rubrics
-scores (% right and wrong)
-class average

Other (e.g., users, reviewers, collectors, analysts, decisions):
-Language teachers decide the language level placement of each student
-Language teachers decide how to incorporate feedback results into their class
-Dean and language teachers review survey results for each class; the Dean gives feedback to language teachers
-All language faculty would like the results of the alumni survey for each individual language program

29 Peter H. Rossi et al., Evaluation: A Systematic Approach (California: Sage Publications, Inc., 2004), 97.
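To make the data analysis techniques listed above concrete, the following is a minimal sketch of the kind of tally they describe, assuming survey responses are available as simple Python lists and that open comments have already been coded as positive or negative; the values are illustrative placeholders, not the program's actual data or tooling.

```python
from collections import Counter

# Illustrative inputs only (not the program's actual data): Likert ratings (1-5)
# and open responses pre-coded as positive or negative.
ratings = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]
open_responses = [
    ("liked the authentic articles", "positive"),
    ("too few speaking opportunities", "negative"),
    ("great classroom discussions", "positive"),
]

# Disaggregate ratings into frequencies of 1-5, as the matrix describes.
freq = Counter(ratings)
print({score: freq.get(score, 0) for score in range(1, 6)})

# Split open responses into what students liked vs. didn't like.
liked = [text for text, tone in open_responses if tone == "positive"]
disliked = [text for text, tone in open_responses if tone == "negative"]
print(f"{len(liked)} positive comments, {len(disliked)} negative comments")
```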

Analysis of Current Evaluation Practices

Compared with the aforementioned best practices applied in the logical framework in Figure 5, the
Language in the Degree Programs' current evaluation practices discussed in the first part of the paper and
outlined in the Levinger Evaluation Matrix do not provide optimal information for improving program
performance. Upon review of the Levinger Evaluation Matrix in Figure 6 above, student learning
outcomes are not measured effectively enough by the current indicators, data collection instruments,
techniques and analysis, or by those who use and review the information. The Program has neither
specific objectives nor baseline indicators against which it can measure its performance.
Furthermore, the primary data collection tool, the survey, is highly criticized by faculty and students alike
as to how useful it is in collecting the information necessary to improve program performance. I will
elaborate on the program's most frequently used evaluation tools first, starting with the survey,
continuing with the alumni survey and the placement exam, and ending with the infrequent
monitoring evaluation. School X's Language for the Degree Programs is, however, aware that language
learning is not measured effectively enough and has taken action to get at these outcomes through two
interventions, which I will analyze later in this section: pre- and post-testing and the creation of the
Portfolio requirement.

Survey
Since School X’s Language for the Degree Programs’ current evaluation methods rely on the survey to
provide the most meaningful and objective information, it is important to look at this first and foremost to
see to what degree it is helpful in improving the program. In general, there are several criticisms from
language faculty and staff as to why the survey is insufficient in collecting the necessary information
about the program’s performance. The faculties are interested in changing the evaluation procedures for
classes because they are too generic and not tuned in to adequately tell them more about the impacts of
their teaching. They also feel that certain components of the survey should be omitted because they feel
students are unqualified to answer the questions. The current survey only measures the outputs of the

18
program, instead of getting at a higher-level change in the human condition associated with actual
language learning outcomes. The number of those involved in the evaluation of the survey and decision-
making process is limited, its results viewed only by the Dean and specific teachers for each class. All of
these criticisms have resulted in the overall opinion that the survey needs to be improved.

The survey itself in Appendix A reflects very little input from the faculty themselves, who are amongst
the main users of the information and depend on it to inform their decisions, or from the students
who actually fill out the survey. The Program assumes that faculty are eager to see and interpret the
results of the surveys so that they may know what is working and what needs improvement. Some adjunct
language faculty do not know what questions the survey asks and do not always find the information
helpful; therefore they do not use it to make changes to their curriculum. The survey
contains mainly descriptive and normative questions that are "weak for impact questions unless they are
supplemented with serious qualitative causal analysis,"30 which they are not. A teacher commented that
students who write extra comments at the end of the survey keep repeating the answers they already gave,
which suggests that the questions are too generic or not thought-provoking enough. The survey does
not allow faculty to ask specific questions that elicit the student feedback they need to adapt their
curriculum to the needs of their students. As it is administered online only at the end of the
semester, the response rate is less than ideal and faculty must wait until the following semester to
incorporate feedback and make changes for next year's class. As a result of these drawbacks, some
language faculty have had to design their own paper survey or midterm evaluation in addition to the
end-of-semester survey. This approach requires a teacher to be extremely open and responsive to the feedback.

Looking at the survey, faculty recommended omitting the category Instructor's Performance altogether
for a few reasons. Students can gain a sense of empowerment from the survey and can use it to let out all
of their frustrations about the professor or the class from the semester. One professor commented that he
felt as if he was on Ratemyprofessor.com, where students fail to rate a professor's performance
objectively. Some faculty do not believe students are capable of judging a teacher's competence
in the subject matter. They felt that a qualified expert, such as a peer or another language faculty member
using a set of standards or objectives, is the best judge of an instructor's performance, including all of its
components listed in the survey. Currently, the faculty see the instructor as being evaluated twice in the survey: once
in overall instructor performance and once in overall rating of the professor. They are not convinced that
students perceive these two areas as being different, or that they are in fact different, which affects the average
score of the evaluation. The faculty also felt that a peer, another language faculty member or another observer could
comment on respect for students and responsiveness to questions under the category The Instructor's
Relationship With Students. Faculty did not feel that students could accurately assess availability
during office hours and appointments, as some students do not take notice of set office hours or bother to
make appointments and simply assume the teacher will be available. Faculty also felt that the category My
Contribution To My Learning was unclear. The category included various inputs outside of the program
that influence student performance and/or experience in the program. Since the language courses are
mandatory and the level of difficulty has been established according to the placement exam, looking at the
initial interest in the subject matter and the difficulty level of the class as inputs is not a good measure of a
student's contribution to his or her learning. They are actually good measures of the extent to which the
program is delivering the appropriate program to meet the needs of the target group. A student's
contribution to class and work outside the class is extremely vague, making it impossible to determine from
the results of the survey. These are also not necessarily evaluated objectively by the students
themselves. There remains the question of what the purpose of this category is for the program and
how it seeks to use the information to increase program performance, for example, by mandating a certain
level of personal contribution in order for students to achieve optimum results from the program.

30 The World Bank Group, "Module 5. Impact, Descriptive, and Normative Evaluation Designs." International Program for Development Evaluation Training 8 (1994): 13.

Course organization, instructor performance, instructor relationship with student, overall rating of the
professor, and the grade a student anticipates are indicators that the program uses to measure students'
satisfaction levels with the program. These indicators are associated only with the program's
outputs, the actual course and learning environment, and do not measure the actual language learning
outcomes that result from them. The survey questions could be improved to better reflect the values of
MIIS and to provide more meaningful insight into how students feel about their learning environment.
Currently, all of the questions in the survey are measured on a Likert scale of 1-5, with 1
being poor and 5 being excellent, which faculty perceive as confusing for international students, who
might interpret a 3 as average according to European standards, for example, whereas U.S. students
interpret a 3 as merely fair. This difference in perception might influence the way students fill out the
entire survey, affecting its results.

The current questions in the survey do not meet the requirements set forth under the SLO assessment,
which is designed to get at a program's outcomes. Under the category of Outcomes, it is unclear exactly what
outcomes have been achieved through the program. Opportunity to practice and use material, for example, does not
indicate whether any skills have been improved or whether a change in the human condition has come about through
the acquisition of those skills. Although deepening mastery of subject matter does describe an improvement of
skills, it is still too vague, as it does not specify whether it is measuring
actual language and analytical skills or cultural knowledge. It would be difficult to see, for example, the
effects of a new pedagogy on the class or program, because those effects are not measured; one would have
to look at the comments from students about what did or did not work in the class. It is also difficult to
discern exactly how the different objectives teachers create based on student needs are measured
given the current indicator. Different language programs face different problems and student
needs that they try to address. For example, faculty who creatively meet the needs of students in larger classes
by breaking topics down by interest level might want to measure the extent to which their approach was
successful in increasing learning outcomes for students. This would show up under the section on
what one liked or didn't like about the course/additional comments only if the professor specifically asked
or hinted that students should comment on it.

The current practice of comparing the overall average survey score, along with its median and
standard deviation, with those from the previous year cannot accurately indicate whether or not
the program is successful in improving language and analytical skills and cultural knowledge to reach
global career opportunities at the purpose level, because the survey focuses on measuring the program's
outputs, not actual outcomes. The average score is also reported as a raw number rather than a percentage,
limiting how easily it can be understood in terms of changes from year to year.
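For illustration, a minimal sketch of how the current summary statistics could be computed and how the year-over-year change could be expressed as a percentage rather than a raw number; the scores below are hypothetical placeholders, not the program's data.

```python
from statistics import mean, median, stdev

# Hypothetical Likert responses (1-5) for one class, this year and last year;
# placeholders only, not actual program data.
this_year = [4, 5, 3, 4, 4, 5, 2, 4]
last_year = [3, 4, 4, 3, 5, 3, 4, 3]

avg_now, avg_prev = mean(this_year), mean(last_year)
print(f"current average {avg_now:.2f}, median {median(this_year)}, std dev {stdev(this_year):.2f}")

# Expressing the year-over-year change as a percentage of the previous average
# makes the comparison easier to interpret than two raw numbers.
pct_change = (avg_now - avg_prev) / avg_prev * 100
print(f"change from previous year: {pct_change:+.1f}%")
```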

The Dean and the Language Chair are aware that most student meetings revolve around
the incidence and prevalence of student complaints about the program. Because complaints are tracked as a count
and not a percentage, the way in which they can be understood in terms of changes from
semester to semester or year to year is limited. There is no current system for students to give anonymous
feedback, which could lead to students using or abusing the survey as the means for expressing their
troubles with the program instead of objectively evaluating how the program met their needs in terms of
language learning outcomes. Only the Dean is given access to the overall survey and views it as the best
indicator of actual program performance. The Program Chair, who is strategically located between
the advisors and the Dean and is the first point of contact for students as well as professors, hears a
majority of the complaints as well as the positive feedback, yet lacks an evaluative role in the Program.
Thus, valuable input and information from the Program Chair is left out of the current evaluation
practices.

Alumni Survey
The alumni survey (see Appendix B) includes two questions regarding language study and is an
instrument available to the Language for the Degree Programs that can effectively measure the extent to
which language learning outcomes enable students to reach across cultural and linguistic barriers
towards more global career opportunities. The alumni survey collects information about the impact of all
of the language programs at School X. Among the findings from the survey, only a small percentage of alumni
actually use their language in their current employment, compared to a larger percentage who see it as valuable in
getting the job they have today. Whether or not language was the main reason they came to the
Institute depended largely upon their expectations of School X. Although the information from the
alumni survey can be disaggregated to show responses from only alumni of the Language in the Degree
Programs, it is currently not disaggregated due to financial and/or time constraints. Many of the language
faculty and staff are interested in these data, but no one is responsible for turning them into meaningful
information for each language program. Thus these data cannot be used to determine how the program
could best be improved during the decision-making process.

Placement Test
An initial placement test (written and oral proficiency test) helps the language teachers decide the
language level and placement of each student in the Language for the Degree Programs. Teachers for
each language collect and grade a written placement test and assess oral proficiency only once, at the start
of a student's first semester. Scores from the written test are determined by the percentage or number of
correct vs. incorrect answers and compared with the average score and standard deviation for the group. It is
unclear to what extent Oral Proficiency Interview guidelines are followed and standardized across all of the
language programs to determine the actual language proficiency of a student. The current practice is,
in any case, inconsistent. According to students, the oral proficiency test for each language varied in terms
of how it was implemented. One student recalls a teacher showing a short clip of a current event and
asking the student to explain what was going on and to answer specific questions about it. The student
also had to answer questions that tested his or her ability to use different tenses. Another student simply
described what he or she did over the summer and before coming to School X. Currently, the same written and oral
placement exam is not administered to students upon the completion of their language program study.
Although doing so would clearly measure the actual change in learning outcomes, or the improvement in
language, analytical skills and cultural knowledge, and attribute them directly to the Program, the
inconsistent administration of the oral proficiency test would undermine the results. Unless the placement test
is implemented consistently, along with a post-test to follow it at the end of a student's language study, the placement test
will remain a highly ineffective evaluation practice.

Monitoring Evaluation
One current evaluation practice that has been effective in providing meaningful information on how to
increase the program’s performance is an infrequent monitoring evaluation, occurring once every several
years. The monitoring evaluation included a survey about language study experiences at School X (See
Appendix C). The results of the study in Appendix D confirmed what the Program already knew in terms
of its own strengths and weaknesses. Because the survey was open-ended, the responses offered a rich picture
of the program and how it was or was not meeting the needs of the students; the open-ended format, however,
also meant that some questions were left blank. Results from the survey showed that there was a problem with
the placement tests, as some students moved up in language levels without increasing their proficiency
compared to students who scored well on the placement tests in later semesters. Students also proved difficult
to distribute across courses by language level, as some languages with a lower number
of students had only two placement options, compared to other languages that had more options. It was
difficult to balance where students needed to be against where they wanted to be, in order to meet student
expectations and give them the most value out of the program. Problems with Program coherence and
the importance of more permanent faculty were clear, but the Program still considers cost and time
constraints as major factors hindering changes.

The language staff was not in agreement about the purpose of the evaluation. One view was that the
evaluation was designed to only find out how students felt about the language courses in general and
specifically in terms of leading to future employment opportunities. Another view was that the evaluation
was also designed to find out if the language requirements offered at the school were meeting the needs of
the target group. The Arabic program, for example, is the only program offered at the 200-level and is
quite full. The survey seemed to indicate there was demand for other languages to be taught at lower
levels so that the school’s students could learn more languages or even different languages. Many
concerns came up over this, including how the make-up of students who enter at the 200-level could
affect the overall brand of the school and how offering them additional options could impact the functions
they are capable of performing in their future careers. As a result of the evaluation, another 200-level language pilot
program is now being considered to determine whether it is in fact a need of the target group. The exact date that this
program could come into effect has yet to be determined.

Past Interventions: Post-test and Portfolios


Measuring language learning outcomes had been a priority of School X's Language for the Degree
Programs in the past, with two past interventions: pre- and post-testing and language portfolios. These
served as mechanisms for obtaining information relating to learning outcomes. Both of these
interventions suffered from design and implementation failures. Their design did not include input
from the target group or from the main users of the information. The implementation of both interventions
was not standardized and/or controlled and was therefore ineffective in measuring program performance.
Other barriers to effective implementation included the lack of political will or of a champion
to continue with and improve either intervention, as well as a lack of prior experience.31

Post-test
The first intervention, giving students the same test they took at placement, before their language study at School
X, again at the end of their last semester of the language program, proved unsuccessful for a variety of reasons. It
was difficult for the program to attribute increases in language learning outcomes directly to
the program itself, given the program's flexibility. The semester schedule varied amongst students;
some students took a semester off for internships and either did not work with the language for a full
semester or they might have worked intensively in language while abroad studying or during an
internship. Students who entered the program extremely fluent in their language because they recently
came from that country might actually perceive a drop in their fluency while in the program because the
environment in which they learn is different and not as immersive. Other students who are not fluent in
their language might experience greater gains in all areas. Given the different needs and expectations of
the students, the perceived gain in different language skills from the program also varied greatly.
Actual increases in language skills with respect to a student's field of study might not be perceived at all
by the student and are difficult to measure with the standard test. These different experiences could either
positively or negatively affect students’ language abilities in the program that would enable them to reach
across cultural and linguistic barriers towards more global career opportunities. The evaluation design for
the pre-and post-test did not “rule out other feasible explanations for the observed results in order to
conclude that the intervention had an impact.”32 The design did not use standard scores as a means of
making the scores from all of the students comparable, despite the differences in their language abilities
and growth within the program. It also lacked a necessary control group of select students from the
population who met the criteria with the least amount of variables who could be tested to see how
effective the program was. T-tests were not performed and the data was not normalized in order to rule
out possible outliers in the data and to generalize conclusions about the population. The program also had
difficulties coming up with a proficiency test that assessed language skills and ability, cultural knowledge
and language skills in the context of different degree fields. Since the current placement test includes an
oral proficiency that is not standardized using OPI and its implementation inconsistent, the results from
the post-test were not reliable enough to draw conclusions about the program’s effectiveness. As the
results from the first pre-post test proved insignificant for determining the program’s effectiveness, post-
testing immediately stopped and has not implemented since.
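As a rough sketch of the analysis this design lacked, one could standardize scores and run a paired t-test on matched pre- and post-test results. The scores and library choices below are illustrative assumptions on my part, not the Program's actual data or procedure.

```python
import numpy as np
from scipy import stats

# Hypothetical matched pre- and post-test scores for the same students;
# placeholders only, not actual placement or post-test results.
pre = np.array([55, 62, 70, 48, 66, 59, 73, 51])
post = np.array([60, 70, 72, 55, 71, 64, 78, 58])

# Standard (z) scores put students who start at very different proficiency
# levels on a comparable scale within each test administration.
pre_z = (pre - pre.mean()) / pre.std(ddof=1)
post_z = (post - post.mean()) / post.std(ddof=1)

# A paired t-test on the matched raw scores asks whether the average gain is
# larger than chance alone would suggest; without a control group it still
# cannot attribute that gain to the program.
t_stat, p_value = stats.ttest_rel(post, pre)
print(f"mean gain {float((post - pre).mean()):.1f} points, t = {t_stat:.2f}, p = {p_value:.3f}")
```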
31 The World Bank Group, "Module 11. Building a Performance-Based Monitoring and Evaluation System." International Program for Development Evaluation Training 8 (1994): 8.
32 The World Bank Group, "Module 5. Impact, Descriptive, and Normative Evaluation Designs." International Program for Development Evaluation Training 8 (1994): 2.

Portfolios:
The two different portfolios were designed for both Western and non-Western languages in order to
account for learning differences. (See Appendix E for drafts of the portfolio requirements for non-
Western languages). With a portfolio requirement, students could play an active role in assessing the
improvement of their language skills and had the opportunity to continuously learn about their strengths
and weaknesses throughout their language study. It offered students a way to reflect on language learning
and to demonstrate their ability to use the target language and/or operate professionally in it, to self-
critique and to become autonomous learners. The emphasis of the portfolio was using qualitative methods
to assess the language level, instead of quantitative. Although the intentions of the portfolio were good,
the results were mixed. Language faculty as well as students did not appear to have much input, if any, in
the design of the portfolio requirements. They criticized the portfolio for having requirements that do not
match up with course objectives or deliverables. For example, a class might not require a fifteen-page
paper or have time for twenty students to film an individual presentation as the language portfolio
requirement stipulates. The language faculty did not seem aware of the time and effort needed for the
portfolio. There were great differences in the understanding of each teacher's role in the implementation
of the portfolio, in giving feedback to each student, and in its overall assessment. Initially many students resisted the
portfolio, as they perceived it as an interruption to their language learning. Comments from students who
completed the portfolio described: teachers who took the portfolio very seriously and held students to firm
deadlines and high standards when it came to grading; teachers uninterested in the portfolio's motivation who
offered little help and guidance throughout the process; and teachers who offered helpful feedback throughout the
process and/or upon assessment, or no feedback at all. The different approaches affected students'
motivation and attitudes regarding the portfolio. Upon its completion, students who had a positive
experience with the portfolio were glad they could actually track their progress and have tangible
outcomes, such as samples of their skills, to show prospective employers. Students also had a better
understanding of their improvement because of the program, but could not fully express this when answering
the questions in the current survey. Reactions to the portfolios were captured in a collection of video
interviews, conducted by the Language Chair in May 2009, of students from almost all of the language programs
who had completed a language portfolio. The interviews, in addition and in contrast to the survey,
offered more detailed and balanced views on the benefits as well as the drawbacks of the language program.
Students with a negative experience felt that they were not provided with the support needed to
effectively evaluate their own work, and that the portfolios were not evaluated properly or given enough
feedback to make them valuable enough to use with prospective employers. Since the
portfolio requirement was implemented for only a short amount of time and discontinued, the program
could not know how effective it was at measuring language-learning outcomes. Although the program is
still concerned with questions on how and when to implement the language portfolio, the design of the
portfolio itself is questionable and needs improvement. Therefore, all stakeholders should agree upon the
design as well as the implementation and evaluation of the portfolio first before it is used to measure
program performance.

Improving program functions

The current evaluation practices analyzed above and described in Figure 6 of the Levinger Matrix remain
focused on the output level of Figure 5 of the Logical Framework Matrix for The Language in the Degree
Programs. Real language learning outcomes at the Purpose level are not measured effectively and
therefore remain unknown to the Language in the Degree Programs. Therefore, the program cannot fully
determine to what extent it is successful and how to best improve the program. It is difficult to determine
if current practices that focus on addressing and meeting the needs of the target group at the output level
are effective enough, even if an average 4 or 5 satisfactory frequency score is obtained, because the
measurable change in the given social arena33 at the purpose level is not measured. Such indicators at the
purpose level are needed to set a baseline objective for a comparison between a given standard and the actual
result, in order to measure the performance of the program for each category.

33 Peter H. Rossi et al., Evaluation: A Systematic Approach (California: Sage Publications, Inc., 2004), 204.

The Logical Framework Matrix for
School X’s Language for the Degree Programs in Figure 5 contains possible objective verifiable
indicators, means of verification and assumptions to get at outcomes and changes that result from the
program that I feel are the most important for measuring program performance. This matrix is meant to
incite discussion among the stakeholders in the Language in the Degree Programs, so that they use it as an
example of how they can better demonstrate the impact of their program using impact theory, and to add
indicators and other means of verification if it is politically and financially feasible. The logframe should
reflect the priorities as well as values of School X and Language for the Degree Programs.

Recommendations

Considering that School X’s Language for the Degree Programs do not currently use the optimal
evaluation processes needed to measure program performance, and that findings from current practices
are not sufficient to improve program functions, I recommend the following:

Increased participation in evaluation practices and processes

The stakeholders in the program, including staff, language faculty and students as well as all
users of the information from the evaluations should have a voice in deciding how to assess the program
so that the information obtained remains meaningful to all users and in alignment with organization and
program values. By involving stakeholders in everything from defining the evaluation questions that best
measure language learning outcomes to choosing the instruments and means of verification, creating indicators
and setting objectives, the program's performance can be measured more effectively. If language faculty are included, for example,
more relevant information about what they need to know in order to make changes to their curriculum and
approach will be collected, making the data more meaningful for improving program performance. This applies
especially to the survey, which many staff and language faculty want revised, but also to the other
recommendations that follow. Increased participation could also help disaggregate the data from the
alumni survey into meaningful information to inform decisions. The Program Chair, specifically, should
have an evaluative role given that her role as facilitator and arbitrator, hearing and following up on
complaints and praises of the program for students and faculty, is crucial in improving the program’s
functions and performance. Although not everyone wants a voice in the process and more people
involved can slow it down, roles can be clarified in order to maximize efficiency and to ensure evaluative
methods are standardized. With increased participation, discussion of and recommendations on important
past interventions that actually measure language learning outcomes, namely the portfolio and the pre- and post-test,
can move forward on the agenda and inspire action. Most importantly, participation is a value of
School X; therefore it is imperative that the Language for the Degree Programs takes immediate action to
ensure it is reflected in its evaluative processes and practices.

Diversifying the Means of Verification

Instead of relying primarily on the survey alone to measure program performance (as it includes
indicators that measure student satisfaction rather than actual language outcomes), I recommend expanding
the means of collecting information to include pre- and post-testing and an exit survey. Other "nice to
have" means of verification, if resources and time permit, include portfolios, observation, and panels; these
yield qualitative data that can offer a better perspective on both the positive and negative effects of the program.34
With these different means of verification, a more diverse set of indicators could be created to measure a
program’s outcomes and performance even further.

34 The World Bank Group, "Module 5. Impact, Descriptive, and Normative Evaluation Designs." International Program for Development Evaluation Training 8 (1994): 14.

• Re-thinking the pre-post test intervention: The oral proficiency test as part of the pre-test
should be standardized so that students' oral language abilities are assessed more
objectively. Adding questions to the pre-test, as well as a post-test to follow the initial placement
exam, can provide a before-and-after measurement of language learning outcomes to
gauge program performance. Questions such as: How would you rate your German
language ability right now in terms of proficiency: speaking, writing, reading; could be
added to the pre-test and asked again in a post-test or combined with the exit survey. The
tests must use standard scores so that student scores depict the changes brought about by
the program accurately. Instead of collecting data from the post-test from the entire
population of the target group, a random sampling or a criteria-based selection from the
population could be used in order to determine the success in language learning outcomes
that can be attributed to the program.
• Exit survey: An exit survey could yield accurate and meaningful information about a
student’s overall experience in the program, including a student’s summation of the
overall effect of the program on his or her language and analytical skills and cultural
knowledge. The results of the survey could be compared to previous years so that the
program can see how it is performing. Questions for this survey that can help measure the
overall student competency in language and analytical skills and cultural knowledge
could include the following: the degree to which a students’ understanding of another
country, culture and its customs were broadened, the extent to which a student can
communicate effectively in their foreign language, a student perception of progress of
his/her language abilities, and the level of student confidence presenting a topic in a
professional manner in the target language in front of his/her peers upon completion of
the program. Some questions for the survey should be open-ended so that the results are
enriched with complete feedback on the entire program that is not constrained in any way.

Nice to Have:
• Re-thinking the Portfolio intervention: A student portfolio requirement has the potential
to gather important information about a student’s perspective on his or her progress in the
program they might not perceive otherwise through the use of self-assessment and
reflection as a data collection technique. In addition to tests, a portfolio can provide a
much more accurate picture of a student's language level and learning outcomes and will help
students answer the questions on the exit survey. Although getting an
evaluation tool like the Portfolio back on the agenda requires some momentum and
political will, the original design and implementation of the portfolio were criticized.
Therefore, it should be re-evaluated and re-designed including feedback from all
stakeholders involved so that it is successful and sustainable.
• Observation: Language faculty believe peer-to-peer review is the most appropriate
and effective way to evaluate and improve teacher performance, rather than using a survey.
As of now, the language faculty's schedules do not allow for this rich learning
opportunity.
• Panel: A panel design that consists of a smaller group of people who have their
experience in the program tracked and recorded in a great amount of detail could produce
substantial information on effects of the program that are currently unknown to it, whether
intended or unintended.

Develop indicators using the QQTP

Each language-learning outcome should have at least one indicator, and each indicator a separate target.
Numbers in the indicators should be expressed as percentages to indicate the change that is occurring as a
result of the program, making the information more interpretable and meaningful. For example, an
objective can be set to achieve a 2% increase in student satisfaction levels with the course and learning
environment at the end of the semester compared to previous semesters. In order to achieve this, perhaps
some improvements are made to the learning environment, such as
an increase in the use of technology in the classroom. This information can be collected by a survey
question, one of many that contribute to measuring the course and learning environment at School X.
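A quick sketch of the arithmetic behind such a percentage-based indicator follows; the baseline and current averages are made-up figures, and only the 2% target comes from the example above.

```python
# Illustrative arithmetic only: compare the actual change in average satisfaction
# against the hypothetical 2% improvement target described above.
baseline_avg = 3.90   # average satisfaction score in previous semesters (assumed)
current_avg = 4.02    # average satisfaction score this semester (assumed)
target_pct = 2.0      # objective: a 2% increase over the baseline

actual_pct = (current_avg - baseline_avg) / baseline_avg * 100
print(f"actual change: {actual_pct:+.1f}% (target: +{target_pct:.1f}%)")
print("target met" if actual_pct >= target_pct else "target not met")
```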

Redesign the survey

I recommend that the survey be redesigned to include the three recommendations above, so that it
measures actual language learning outcomes of the program as well as the program’s outputs more
effectively. The Student Learning Outcome assessment in Figure 3 should be used to develop better
survey questions for a more comprehensive survey that can measure learning outcomes. Each question in
the survey should contain a clear statement of what students will be able to do outside the classroom as a
result of what they have learned, and must be measurable. As these focus on student performance, not
teacher performance, the Instructor Performance should be considered for removal if it can be evaluated
by the Dean or by an Instructor’s peers as it is assumed they are qualified to teach the subject matter by
the program. Negative student feedback on instructor performance already shows up in oral and written
complaints to the Dean and the Language Chair.

The indicator listed in the logframe is a must-have because it measures the learning environment. Other
indicators could include the percent of reported and written complaints in a given semester compared to
previous semesters. For the survey, instead of using questions that ask about the usefulness of tests and
quantity of assignments as well as course objectives related to goals with the Likert Scale, course
organization could include questions about the topics covered in the course according to the syllabus as
well as the authentic and/or current articles used during a given class. This information can then be
compared to a given standard set by the Program. Questions concerning the interaction in a given class,
including participation and discussion, between students, themselves and the teacher could indicate to
what extent the class was a collaborative learning environment. Questions concerning the frequency of
words of praise or encouragement in a given class as well as student motivation to apply for internships or
career opportunities in a country where their foreign language is spoken, could indicate to what degree the
environment is positive. The Program could incorporate innovation as a school value into questions
concerning course organization by determining first what innovation means for technology in the
classroom. They could then ask questions about the frequency with which various technologies are used
the classroom. Although customizing language surveys for each language program will cost time, I
recommend that the survey include two questions designed by the instructors so that they can ask
questions more relevant to their objectives for the class to obtain meaningful information. This could help
give instructors more buy-in by giving them the opportunity to actively participate in the survey, and to
make good use of its results. All of these questions help measure whether or not the program is delivering
what it intended to deliver more effectively than the current ones.

Anonymous feedback mechanism

As of right now, the Language for the Degree Programs assumes that its students who have problems,
concerns or any feedback voice their concerns either to the Dean, the Language Chair or to their advisors.
The creation of an anonymous feedback mechanism would offer another outlet for students other than the
end of the semester survey to more openly and anonymously express their concerns and feedback about
the language programs and instructors. Students could give feedback on their own time, in the quickest and
easiest way possible, instead of having to make an appointment, especially if it is general feedback they
would like to give rather than a particular problem or concern. More of the problems students have in class
could be addressed well before the semester ends. Students would also be less likely to use the survey to let
out all of their criticisms of the program, making it more objective.

Conclusion

I have given an overview of School X’s evaluation processes to answer the meta-questions concerning
what effect the Language for the Degree Program had on students’ competency in language and analytical
skills and cultural knowledge. The Language for the Degree Programs measures the effectiveness of its
programs primarily through survey questions in its evaluation practices. Although it is important to know
to what extent the target group is satisfied with the program, this does not give enough meaningful
information for staff to improve the program and to hold the program accountable to its purposes. Several
evaluation tools and diagrams were introduced to look at and test the principle logic behind the Language
for the Degree Programs. Finally, recommendations were given in order to further develop the evaluation
practices within the Program. These recommendations also have potential external utility and could
be applied at other schools, so that they may likewise improve their survey systems and learn from the best
practices mentioned in this report. Incorporating sound evaluation practices will allow these organizations
to be more open to change and improve program functions.

Assessment is not easy and can be quite “messy” and, according to Michael Morris, it is an extremely
purposeful part of evaluation as it “enables departments to know themselves and to take action towards
enhanced student learning.”35 Although developing statements of purpose, setting objectives and creating
appropriate measures to determine how well the objectives are met seems very systematic, it is important
to remember that it can often be a difficult, iterative process. Past interventions such as the post-test and
portfolio, which are designed to deliver information about program outcomes, should be followed up on,
evaluated, re-designed and implemented again. It is imperative for the Language for the Degree Programs
to take a home-grown approach and even consider creating new tools to fit specific program needs and
limitations when figuring out which assessment instruments to use. In Michael Morris’s experience, a
Simulated Oral Proficiency Interview (SOPI) instead of OPI was more cost-effective and made more
sense given the size and the goals of the program.36 Each level had descriptors that articulated what the
program deems students should be capable of doing, and still incorporated language adapted from the
ACTFL Oral Proficiency Guidelines and the National Standards for Foreign Language Learning.37 These
descriptors helped to create indicators that could actually measure program effectiveness.

After the failure of the Language for the Degree Programs’ past two interventions to measure learning
outcomes, there was little or no follow-up to re-design and implement them. Although these interventions
take place within a political realm, where programs are "creatures of political decisions," each having "a
political stance,"38 thus making it difficult to take action, School X cannot afford to hold off on
measuring language learning outcomes. If the strategic goal that the school set in 2006 of achieving
greater global impact in language learning and cultural understanding39 is to be achieved, increased
participation and support for effective program evaluation that can measure these impacts is imperative.

35 Michael Morris, "Addressing the Challenges of Program Evaluation: One Department's Experience After Two Years," The Modern Language Journal 90 (2006), Northern Illinois University.
36 Since the test did not adequately reflect the goals of the program, he made changes to the rubrics to include three different levels: superior (exceeds program goals), proficient (meets program goals), and deficient (does not meet program goals).
37 The Modern Language Journal, Perspectives, Volume 90, Issue 4, 575, Winter 2006.
38 Peter H. Rossi et al., Evaluation: A Systematic Approach (California: Sage Publications, Inc., 2004), 19.
39 http://www.miis.edu/about/governance/plan

Appendix A: Language Studies Evaluation Questions Spring 2007
Appendix B: Language component of the Alumni Survey

Appendix C: Language Studies Experience Survey Questions

Appendix D: Language Studies Experience Survey Results Spring 2009 (coded answers: means/percentages and standard deviations by question)

Appendix E: Language Portfolio Requirements, Non-Western Languages – Jan 25, 2008 Draft

I - What is the Purpose of the LS Portfolio?

The goal of this language portfolio is to document your achievement and growth in academic and professional level language of your choice during your period of study at School X. This professional portfolio may be used to demonstrate to your professors, colleagues, and prospective employers your ability to use the target language and/or to operate professionally in it, to self-critique, and to become autonomous learners. The portfolio is a requirement for graduation as of the Fall 2007 class.

II - What Should My Portfolio Include?

Your portfolio will consist of the following documents:

1. Title page (with your name, program, specialization and graduation date)
2. One written product (paper, project, HW assignments) from your first semester of LS at School X.
3. One written product (paper, project, HW assignments) from your last semester of LS at School X.
4. One oral presentation (audio clip, or video clip, with or without PowerPoint)
5. One Reflection on Your Language Learning paper, written in English with examples in the target language

Each of the two written products you submit should be at least 4-5 pages long for 300-level, 8-10 pages for 400-level, and 12-15 pages or more for 500-level, written in the target language as part of the course assignments. These written products could be a research paper, written case study, policy memo, op-ed, critical summary, or any other written document. You must be the sole author of these products. In case of multiple drafts, please submit one clean copy (your final product) and all the previous drafts that have comments and grades (if any).

Your 'Reflection on Your Language Learning' paper (3 to 4 pages, Times New Roman 12, 1.5 spacing) should be written in English and is the culminating piece of your language portfolio, in which you describe your improvement in your oral and written skills, assess your strengths and weaknesses, analyze specific ways in which you have changed and improved over the course of your work at School X (comparing the two written products you chose above), distinguishing your performance(s) in everyday language and professional language competency at the beginning and end of your language studies at School X, and reflect upon the role of language learning for your area of specialization and overall learning experience at School X. You must also consider the growth in intercultural understanding and your emotional development (your relationship with the language, your personal growth, etc.). As an appendix, please list all the language courses you took at School X.

IV – Where and When Should I Send My Language Portfolio?

Please submit your portfolio electronically to the language program head at the end of your last semester at School X. For some students, the language portfolio will be turned in as part of the mandatory portfolio. For others, please ask each PH for details.

V - Questions? Special Situation?
Bibliography

Rossi, Peter H., Mark W. Lipsey, and Howard E. Freeman. Evaluation: A Systematic Approach. California:
Sage Publications, Inc., 2004.

McNamara, Carter. “Basic Guide to Program Evaluation.” Authenticity Consulting LLC.

Wright, Barbara. "Learning Languages and the Language of Learning." The Modern Language Journal 90,
no. 4 (Winter 2006): 593–597.

Interview with a Dean 9/28/2010

Interview with a staff member 9/29/2010

Interview with a Dean 10/12/2010

Interview with Kathi Bailey, President of The International Research Foundation for English
Language Education (TIRF) and professor of Applied Linguistics at the Monterey Institute of
International Studies, 9/30/2010

Interview with Alumni Relations 10/6/2/2010

Interview with a staff member 10/08/2010

Interview with language faculty 10/19/2010

Interview with language faculty 10/26/2010
