Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

Downloaded by [Tulane University] at 04:33 04 September 2014

Journal of Multilingual and Multicultural Development, 2013
Vol. 34, No. 7, 690–708, http://dx.doi.org/10.1080/01434632.2013.796958

A model and questionnaire of language identity in Iran: a structural equation modelling approach

Mohammad Khatib and Saeed Rezaei*

English Language and Literature Department, Allameh Tabataba’i University, Tehran, Iran

(Received 14 November 2012; final version received 27 March 2013)

This study consisted of three main phases including the development of a hypothesised model of language identity in Iran, developing and validating a questionnaire based on this model and finally testing the model based on the questionnaire data. In the first phase of this research, a hypothesised model of language identity in Iran was developed based on the literature and consultations with a panel of experts. After that, a questionnaire was developed and validated to tap the components of the hypothesised model. In order to develop this questionnaire, the researchers went through a number of rigorous steps including content selection, item generation, writing the rating scales and personal information part, expert opinion, item revision, initial piloting, reliability estimation and finally validation. The results of the questionnaire administration indicated that the reliability of the questionnaire estimated through Cronbach’s alpha was 0.73 and exploratory factor analysis also identified six factors. As the final phase of this study, structural equation modelling through AMOS 21 was utilised to test the model. The initial results showed a poor fit model; however, the model was trimmed by removing one item from the questionnaire, and final statistical indices indicated that the model was fit.

Keywords: language identity model; questionnaire; structural equation modelling; Iran; validity; reliability

*Corresponding author. Email: srezaei@sharif.edu
© 2013 Taylor & Francis

Introduction

Language as an identification badge provides one of the most telling clues to people’s identity and where they belong. This symbiotic relationship between language and identity is widely supported in the literature, and recent publications also corroborate this close affinity (Block 2007; Edwards 2009; Joseph 2004; Llamas and Watt 2010; Ricento 2005). In spite of this close relation between language and identity, the fuzziness and malleability of identity have limited the studies to mainly qualitative approaches. Since identity research was initially conducted by anthropologists, sociologists and psychologists, a review of studies in these disciplines indicates that quantitative approaches are more welcomed in these fields (e.g. Phinney 1992; Van Zomeren, Postmes, and Spears 2008). This is in sharp contrast to identity research in language studies where quantitative methods are usually neglected. Nevertheless, this tendency towards quantitative research in neighbouring disciplines has also affected language identity research and recent studies have also endorsed this trend (e.g. Ehala 2012; Polat and Mahalingappa 2010). Though qualitative approaches such as ethnography, interviewing and diary studies have been very fruitful, the best way to overcome the shortcomings in these research methods seems to be developing a framework or model for research. Attempts to develop such models have been very successful and include models of language, culture and identity in different countries and contexts including Israel (Golan-Cook and Olshtain 2011). Considering all the above-mentioned arguments, this study pursued three main objectives.
The first objective of this study was to develop a tentative hypothesised model of language identity in Iran. As the second objective of this study, a questionnaire was developed and validated to test the hypothesised model. Finally, in the last phase the data gathered through this questionnaire were fed into the model to see to what extent the model fit the data.

Multilingualism, multiculturalism and identities in Iran

Iran is a country in the Middle East with diverse languages, cultures and ethnicities. Persian is the national and official language in Iran. It is an old language of the Indo-European family and is classified within the Indo-Iranian branch. Persian is spoken as the national language not only in Iran but also in Afghanistan and Tajikistan, and it is known as Farsi (Iran), Dari (Afghanistan) and Tajik (Tajikistan) (Beeman 2010). In spite of certain linguistic differences and researchers’ contentions about these varieties, this language is still known as Persian. However, there are some differences between Persian spoken in Iran, Afghanistan and Tajikistan including differences in lexicon, pronunciation, grammar and also writing system. As an example, Persian in Iran and Afghanistan is written in Arabic script but in Tajikistan it is written in Cyrillic script. Persian can be historically divided into three types: Old Persian, Middle Persian and Modern Persian. It is currently the mother tongue of almost 60% of the whole population in Iran, and the rest of the Iranians speak other languages and distinct dialects including Azari (Turkish), Kurdish, Arabic, Lori, Gilaki and Balochi among many others (see Payne and Mahmoodi-Bakhtiari 2009; Windfuhr 2009). This variety of languages in Iran makes it a multilingual country with several types of ethnicities.
The dominant ethnicities in Iran include Persians, Azaris, Kurds, Lors, Arabs, Baloch and Turkmen inter alia, which make Iran a multiethnic and multicultural country.

Another important factor in the Iranian identity is related to the significance of language and religion. Persian language along with Islam has historically been a salient factor in Iranian identity. According to Boroujerdi (1998), there are two opposing standpoints regarding Iranian identity: Iranian identity based on the religious view dominant in the country and Iranian identity as propagated by the secular intellectuals based on language or more clearly ethno-linguistic heritage as the main indicator of Iranian identity. From the first perspective Shia Islam determines Iranian identity, and from the second perspective Persian language determines Iranian identity. In support of the second perspective, Yarshater (1993) contends that ‘it is only by loving, learning, teaching and above all enriching this language (Persian) that the Persian identity may continue to survive’ (142).

In brief, Iran is a country with diverse languages (multilingualism), ethnicities and cultures (multiculturalism), which makes it a good site for sociolinguistic research. Although some previous studies have discussed the sociolinguistic issues of Persian (Modarresi 2001), Kurdish (Hassanpour, Sheyholislami, and Skutnabb-Kangas 2012) and Azari (Bani-Shoraka 2009) in Iran, more research is needed for a fuller description of the sociology of languages and identities in Iran.

The study

Researching identity in applied linguistics is achieved through a number of methodological tools including interviews, ethnographic observation and questionnaires inter alia (Rezaei 2012). Although interview and ethnography are two valuable research tools, they are usually time consuming and costly for administration and scoring.
The potential practical problems inherent in interviewing and ethnographic observations make the use of validated questionnaires a viable solution. Although some researchers have used questionnaires as a way to collect their data in identity and attitudinal studies (e.g. Shaaban and Ghaith 2003), they have mainly employed open-ended questionnaires with little information on their validation and reliability index. On the other hand, many researchers have remained loyal supporters of solely qualitative approaches to identity research and have shied away from quantitative or mixed-methods studies. Now that language studies are replete with complex constructs translated into quantifiable constructs and measures like language anxiety, language competence and language motivation, identity likewise can be researched with quantitative or more mixed-methods research tools. In other words, neighbouring disciplines like sociology, psychology and anthropology have done a lot to mix both qualitative and quantitative measures for exploring identity. Examples include measures of ego identity (Balistreri, Busch-Rossnagel, and Geisinger 1995), ethnic identity (Phinney 1992) and gay identity (Brady and Busse 1994). What remains to be added to this list is a good measure of language identity. Tentative and validated models of identity have also been proposed for different contexts including Israel (e.g. Golan-Cook and Olshtain 2011); however, no validated model has been proposed for language identity in Iran. Therefore, in the next sections the steps to develop a model and questionnaire of language identity for the Iranian context will be presented.

A tentative model of language identity in Iran

Developing a model followed by constructing a reliable and valid questionnaire can be very helpful for doing large-scale surveys.
However, the contextual variations should also be taken into account when developing such a model or questionnaire. Hence, language identity can be defined with different components depending on the linguistic, sociological, anthropological and historical context of the language under investigation. Accordingly, the current researchers attempted to develop a model for language identity in Iran to encompass its relevant components. In order to accomplish this purpose, the researchers went through a number of rigorous and iterative steps.

The initial step was to review the previous works and relevant theories to establish the theoretical framework for this study. One of the theories informing this study was the theory of bilingualism and bilingual education (e.g. Baker 2011; Dewaele, Housen, and Wei 2003) as Iranians participating in this study knew at least one language (Azari, Kurdish, Arabic, or English) besides Persian. The theories and studies on language and identity (e.g. Norton 2000) also formed the cornerstone of this research as the whole research focused on identity and its relation with language. Globalisation and language teaching and learning issues (e.g. Block and Cameron 2002; Coupland 2010) were also utilised as language identity is predominantly affected by globalisation. Attitude is the main source of identity and, as Dyer (2007) puts it, part of an individual’s identity is formed through accent and phonology. In other words, language identity can be partially recognised through the accent, dialect or pronunciation that an individual adopts. Hence, part of the model provided here is devoted to pronunciation attitudes based on the works in the literature (e.g. Garrett 2010; Jenkins 2007). Besides, sociolinguistics of identity (e.g. Omoniyi and White 2006) and sociology of language (e.g.
Bourdieu 1991; Giles and Clair 1979; Spolsky 2011) were informative in developing the model in this study because this study falls within the sociolinguistic domain of language studies and concerns how language is a prevailing social factor in identity formation.

Iranians consider Persian language as one of the main pillars of national identity in Iran. Hence, theories and works on language and national identity (e.g. Joseph 2004; Simpson 2007) were also useful. In addition, recent studies show that English language learners have started to adopt their local English types as legitimised forms of English and hence the works on World Englishes and postcolonialism (Brutt-Griffler 2002; Pennycook 1998) helped in composing some of the items in the questionnaire. One component of the model was related to how people associate their social status in the society with the language variety they adopt. The degree to which people associate their social status with the language (Persian, English, etc.) they speak is affected by the extent to which they value the language they use. In addition, language policy issues in the literature (e.g. Ricento 2006; Shohamy 2006; Spolsky 2003) were also used because in Iran the dominant language policy is to value Persian language written in Arabic script. Minority languages or non-official languages such as Azari and Kurdish are not recognised for instruction purposes at schools and universities in spite of the large number of people in Iran speaking these languages. Finally, some local works in Iran about Persian language and identity were helpful in shaping the model (e.g. Meskoob 1992).

After reviewing the above-mentioned literature, a number of components were specified to encapsulate language identity in Iran. In order to confirm the representativeness, appropriateness and accuracy of these components, a cadre of experts on linguistics, sociolinguistics and sociology in Iran and abroad were consulted.
The interviews with these experts were held in both Persian and English and took between 30 and 60 minutes. The content of the interviews pivoted on the components of language identity in Iran. The interviewees were first asked what they considered to constitute language identity, and the components they mentioned were written down. At the end of the interviews, the components they proposed were compared with the ones we had selected a priori. In some interviews, at the end of the interview sessions, the components selected a priori were shown to the interviewees to reflect on. This gave them food for thought to decide about what they had articulated and thereby helped them give more constructive comments. After all these substantive discussions, the components were respecified and reconfigured with some minor changes in the labels of the components and accordingly one new component was also added.

Having reviewed the literature on language and identity, we drafted six main components for language identity in Iran: attachment to the Persian language, pronunciation attitude, language and social status, L1 use/exposure in the society, language knowledge and finally script/alphabet. Attachment to the Persian language can show how far Iranians love their own language, that is, the more they love their language, the higher the Persian language identity. In addition, part of language identity is related to the attitudes people have towards the pronunciation of their own language. This component was included in the model to see how far individual Iranian English language learners would like to adopt English pronunciation patterns when speaking Persian and whether they favour Persian sounds and pronunciation patterns. Some Iranian English language learners adopt English pronunciation even when speaking Persian, which shows how they have been affected by English.
Language and social status was the other component that showed how Iranians associate their social status with the language they converse in. Some people believe that English language can give individuals higher social status. Hence, social status and language became part of the language identity model. L1 use and exposure in the society was another component that was related to the vitality of Persian language in the society and whether Persian is used by the English language learners in the social context in Iran. Some English language learners in Iran become so mesmerised by and attached to English language that they use English in their face-to-face or online daily conversations and communications. Language knowledge was also important because part of Iranians’ (language) identity is manifested in their Persian language and literature. In other words, poetry and literary texts have been very influential in Iranians’ identity. Finally, script and alphabet were included in the model because Iranians have been writing in the Arabic alphabet from the time Arabs invaded Persia in the seventh century. Since then Arabic script has been used for writing; however, in the twentieth century a group of Iranian intellectuals proposed Latin to be used as the writing alphabet in Iran, which raised certain concerns in Iran, and this proposal was finally rejected. Nowadays, many Iranians use Latinised Persian (Penglish) in their online communication or in their text messaging. There seems to be a strong proclivity among the Iranian younger generation to write Persian in the Latin alphabet. That is why script/alphabet was also included in the model to show how far Iranians prefer Latin or Arabic for their writing system. Table 1 below shows the definition for each of the identified components of language identity in Iran.

Table 1. The hypothesised model with its components and definitions.

No. | Component | Definition
1 | Attachment to the Persian language | This component refers to how people in Iran think and feel about their language in comparison to English language as the main foreign language.
2 | Pronunciation attitude | This component refers to Iranians’ attitudes towards their pronunciation patterns in Persian and English and which pronunciation they perceive as desirable.
3 | Language and social status | This component shows how individuals associate their social status with the language in which they speak. In other words, are they proud of their own language or do they associate their low or high social status to the language they speak?
4 | L1 use/exposure in the society | It refers to the extent Iranians use Persian in their daily life in comparison to other competing languages, in this case English.
5 | Language knowledge | It refers to how much information Iranians have about their own language, its history and literature.
6 | Script/alphabet | It refers to how Iranians feel about the alphabet and writing system in their language.

Instrument development and validation

The review of existing instruments and scales revealed that the ready-made instruments were not viable for the purpose of this study. As a result, the researchers decided to develop a questionnaire serving the objectives of this study to test the model of language identity hypothesised in the previous phase. As Dörnyei (2010) puts it, ‘developing a questionnaire is a stepwise process, and the quality of the final instrument depends on the cumulative quality of each sub-process’ (111). All the steps and stages of questionnaire development and validation were done according to the instructions given in manuals on questionnaire development by Brown (2001) and Dörnyei (2010).

Respondents

This study was conducted between July 2011 and August 2012, and the respondents were English language learners in Iran from different language proficiency levels, ages, genders and educational backgrounds. The respondents to the questionnaire included 36 respondents for the initial piloting, 134 for the reliability, 193 for the exploratory factor analysis and finally 482 for the confirmatory factor analysis. A panel of five experts and nonexperts also commented on the wording of the items, content and construct of the questionnaire developed. The expert members of this panel were also consulted for the components of the model hypothesised and tested.

Questionnaire development

In order to develop a reliable and valid questionnaire, the researchers went through the following steps.

Step One: item accumulation and item generation

For any instrument development, the first and foremost step is to review the related literature. This step carries two purposes: (1) to review the existing instruments and (2) to establish a good theoretical framework for the instrument. In this study, these two objectives were already met for developing the model. Hence, the researchers went directly to generate a pool of items based on the hypothesised model. In order to develop the items, the researchers applied content sampling and multi-item scales. To have a representative sample of the content to be included in the questionnaire, the researchers also reviewed a dozen questionnaires in the literature. In this step, several items were generated because they could better measure or tap the target domain under investigation and also because the researchers already knew some items would be eliminated in the pilot study stage.
To generate the items, the researchers did their best to write simple and short items using natural language free of loaded and ambiguous words. In addition, the researchers tried to avoid double-barrelled questions, that is, asking two or more questions in a single item. Care was also taken not to make the questionnaire too long. Hence after item generation, some of the overlapping items were removed. In order to generate the items for the questionnaire, not only did the researchers check the related questionnaires already developed by others, but they also asked a number of figures in the field working on language identity to provide some good items for the questionnaire (expert opinion). In addition, in developing the items the researchers tried to include the same number of positively and negatively worded items. In other words, some weak questionnaires might be developed in a way that most of the responses fall on either the negative or the positive side of the rating scale (e.g. strongly agree). In this questionnaire, the researchers avoided this bias and instead provided a balanced number of positively and negatively worded items. For later analyses, the negatively worded items were reverse coded.

Step Two: designing the rating scales

The rating scale utilised in the current study was based on the Likert scale, the most popular and widely used one, named after its inventor, Rensis Likert. The researchers employed six options including strongly agree, agree, slightly agree, slightly disagree, disagree and strongly disagree. It should be mentioned here that the researchers initially opted for a five-option type including: strongly agree, agree, no idea (undecided), disagree and strongly disagree. However, after reviewing the literature on questionnaire development (e.g.
Dörnyei 2010), the researchers came to know that Iranians are generally conservative in their responses (in spite of anonymity) and might merely choose ‘no idea: undecided’ in some seemingly sensitive items. As a result, the six-option type was selected so that the respondents could not hedge. Another reason for doing so was to make the data approximate a normal distribution. Respondents showed their degree of agreement/disagreement with each statement on a six-point Likert-type scale. To score the items, ‘strongly agree’ received six points, ‘agree’ five points, ‘slightly agree’ four points and so on. Scoring was reversed for the negatively worded items.

Step Three: designing the personal information part

The personal demographic information in this questionnaire included information about gender, age, pronunciation attitude, language proficiency level, length of study, education level, location (city and province), ethnicity, first language and length of stay abroad. Though most researchers put personal information at the beginning of the questionnaire, this might affect the respondents’ responses and be considered somewhat off-putting by some respondents. However, after they have answered the items, they might respond more comfortably. That is why the personal information section was put at the end of the questionnaire. This part of the questionnaire was designed for a later study exploring language identity in Iran and its relation with the respondents’ demographic information.

Step Four: item checking with experts

After the items were generated in English in the previous steps, the researchers recruited a panel of five experts and nonexperts to check their intelligibility and accuracy. Nonexperts were included because their ideas would be helpful to remove unnecessary jargon and loaded words from the questionnaire. Since the final respondents to the questionnaire were nonexperts too, the feedback from the
Since the final respondents to the questionnaire were nonexperts too, the feedback from the Journal of Multilingual and Multicultural Development 697 nonexperts was as valuable as the one from the experts. Content representativeness and bias were simultaneously investigated. This panel of experts included profes- sionals in the field of applied linguistics, sociolinguistics, sociology, Iranian studies and survey design and statistics. The panel of experts were requested to rate the items based on a Likert-type scale from one to four. In this scale, one designated ‘Not important to be included in the survey’, two was ‘Somehow important to be included’, three ‘Important to be included’ and finally four meant ‘Extremely important to be included in the survey’. They were additionally asked to pen in a final decision on the item by selecting either ‘omit’ or ‘keep’ the item as the final decision on each item. The results of the responses obtained from this step reduced the items from 40 to 26 items. Subsequently, 14 items were discarded due to a number of reasons mentioned by the panel including the redundancy, ambiguity, length and irrelevance of the items. The criteria to keep an item or omit it from the questionnaire were based Downloaded by [Tulane University] at 04:33 04 September 2014 on the panel of experts’ opinions. If the majority chose ‘Important’ or ‘Extremely important to keep the item’, the item was subsequently kept. If the majority demanded the item to be omitted or found the item ‘Not important or somehow important to be included’, the item was deleted. As a general rule in this study, items receiving more than 70% of acceptability were kept for the next step. Step Five: item translation and revision After the revisions and modification, the researchers translated the items into Persian so that the future respondents who were to be from different language proficiency levels would be able to complete the questionnaire. 
Initially the researchers intended to administer the questionnaire in English but due to the inclusion of English language learners from low-proficiency levels, the questionnaire was translated to be easier to complete. After the translation was done, items were checked for possible ambiguities. A PhD candidate of Persian literature who was also the editor of some literary journals and magazines was asked to edit the Persian version of the questionnaire and make it standard Persian. Although the questionnaire was translated into Persian for the sake of ease for the participants, both English and Persian questionnaires were tested for their reliability and validity. Back translation was also applied to make sure about the transfer effect from English to Persian.

Step Six: initial piloting and item analysis

Before moving further ahead, the researchers took the following points into consideration. Regarding the length of the questionnaire, the researchers did their best to be short but not to the point of eliminating the central points. This goal was achieved by having the questionnaire take less than 20 minutes to complete. If a questionnaire takes more than that to fill out, it might make the respondents reluctant to cooperate fully. The questionnaire respondents were informed that the information elicited would be kept anonymous so that they would feel relaxed in answering the potentially sensitive items in the questionnaire. Moreover, the researchers tried to put the more sensitive items at the end of the questionnaire so that they would not discourage the respondents from responding to the items. The title of the questionnaire, that is, language identity questionnaire, was removed during the administration because it might have affected the participants’ responses. In developing the items, the researchers were also careful not to make double-negative items because they would sometimes make the items confusing.
Age, education, language proficiency level, etc. were initially generated as open-ended in this questionnaire. However, they were later turned into pre-determined categories to ease later analyses.

After considering all the above points, the questionnaire was administered for an initial piloting. Up to this point, 26 items were generated. Since this was the initial pilot study, the questionnaire was administered to 36 students similar to the target population for which the questionnaire was designed. In order to administer the questionnaire, the researchers used the traditional method, that is, by hand. The feedback was very helpful in modifying some of the items and discarding one. Hence the remaining questionnaire included 25 items.

Step Seven: reliability index

In order to measure the internal consistency of the questionnaire in this study, Cronbach’s Alpha coefficient was utilised. In order to make a decision about a correlation level for an acceptable reliability, several studies were reviewed with various acceptable benchmarks suggested. Following Dörnyei (2010), the current study chose below 0.60 as weak and above that as an acceptable measure for the reliability index of the questionnaire. The questionnaire consisted of 25 items and was administered to 134 Iranian English language learners. The results for the Cronbach’s Alpha showed that the internal consistency of the whole questionnaire was 0.73 and for the six subscales (i.e. the six components of language identity) in the questionnaire, it was estimated to be 0.62, 0.65, 0.70, 0.72, 0.71 and 0.68, respectively. The results of Cronbach’s Alpha indicated that five of the items reduced the reliability of the whole questionnaire dramatically and hence were excluded from the questionnaire.
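
The scoring and reliability steps described above can be illustrated with a minimal sketch. It assumes the standard formula for Cronbach's Alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), applied to six-point Likert scores after reverse coding the negatively worded items. The response data, the choice of which item is negatively worded and the function names are hypothetical and are not taken from the study, whose analyses were presumably run in a statistics package.

```python
from statistics import variance

SCALE_MAX = 6  # six-point scale: strongly agree = 6 ... strongly disagree = 1


def reverse_code(score, scale_max=SCALE_MAX):
    """Reverse a score on a 1..scale_max scale (6 -> 1, 5 -> 2, and so on)."""
    return scale_max + 1 - score


def cronbach_alpha(rows):
    """Cronbach's Alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical raw responses: 5 respondents x 4 items.
raw = [
    [6, 5, 6, 1],
    [5, 5, 4, 2],
    [4, 4, 5, 3],
    [3, 2, 3, 4],
    [2, 1, 2, 6],
]
NEGATIVE_ITEMS = {3}  # assumption: the item at index 3 is negatively worded

# Reverse code the negatively worded items before any analysis.
scored = [
    [reverse_code(s) if i in NEGATIVE_ITEMS else s for i, s in enumerate(row)]
    for row in raw
]

alpha = cronbach_alpha(scored)
print(f"Cronbach's Alpha: {alpha:.2f}")
```

Under the benchmark adopted in this study, a value below 0.60 would be read as weak and a value above it as acceptable. Recomputing the coefficient with one item left out at a time is one common way to see which items lower the overall reliability.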
Other items that seemed to reduce the reliability were intentionally kept intact because the researchers believed those items were important and an acceptable level of reliability was already met. It should also be mentioned that the response rate was 97%, and the main reason for this high response rate was the presence and direct observation of one of the researchers at the data collection site.

Step Eight: validation

The validation process of the questionnaire was based on Alderson and Banerjee (1996) and Converse and Presser (1986). The main types of validity for questionnaire validation investigated in the current study were face validity, content validity and construct validity. Response, predictive and concurrent validities were not investigated because they were not applicable in this study.

Not only should a questionnaire be short, but it should also be pleasing to the eye. In order to fulfil this purpose, that is, face validity, the researchers tried to employ a good layout, font type, margin, colour, etc. Subsequently the face validity of the questionnaire was met by considering these issues and checking them with the previous validated questionnaires in the literature.

Figure 1. Scree plot.

To establish the content validity of the questionnaire, the questionnaire was given to a panel of experts, as discussed above, to judge how far the items were representative of a language identity questionnaire. Moreover, the experts reflected on the wording and the interpretation of the items, and also the instructions given there. To check the content validity, the questionnaire was also given to five English language learners from the target population to respond to using think-aloud techniques. After running these two stages for checking the content validity, some changes were implemented in the items resulting in rewording of some items.
All these changes, that is, those concerning face and content validity, were made prior to the reliability phase. In other words, content validity was established before estimating the reliability. After all these steps, the researchers came up with 20 items tapping the six components of language identity in Iran. Table 2 below shows the six components in the questionnaire (see Note 1), their related items and their reliability indices.

Table 2. Questionnaire components, their related items and reliability indices.

Component 1: Attachment to Persian language (reliability 0.62)
  (1) I wish all my courses at school/university were taught in English rather than Persian.
  (2) I like to attend Persian classes more than English ones.
  (3) I love Persian language and I don’t like English to take its place.
Component 2: Pronunciation attitude (reliability 0.65)
  (4) I think speaking English with a Persian accent is not bad.
  (5) I feel proud of speaking Persian with an English pronunciation.
  (6) I like Persian pronunciation more than English pronunciation.
Component 3: Language and social status (reliability 0.70)
  (7) I believe a person who can speak English very well has a better social status and respect in the society.
  (8) I believe knowing English shows being respectful.
  (9) When I speak English I feel I am superior to others.
Component 4: L1 use/exposure in the society (reliability 0.72)
  (10) I speak English a lot in my daily life.
  (11) I use English words a lot when I speak Persian.
  (12) I like to speak Persian rather than English with foreigners who know Persian.
  (13) I like to speak English rather than Persian with my Iranian friends who know English.
  (14) I read English texts more than Persian texts.
Component 5: Language knowledge (reliability 0.71)
  (15) I like to know more about the history of Persian language rather than that of English language.
  (16) I like to know more about Persian poets and writers rather than English ones.
  (17) I read poetry and stories in Persian a lot.
Component 6: Script/alphabet (reliability 0.68)
  (18) I send text-messages and e-mails in English.
  (19) I like Persian alphabets more than English ones.
  (20) I liked we wrote Persian in Latin alphabets.

In order to establish the construct validity, two procedures were employed. First, the questionnaire was checked for its congruency with the theories in the literature regarding language and identity, as discussed above. This step was done iteratively by checking the items against the research literature. Next, exploratory and confirmatory factor analyses were utilised on two separate administration occasions to check the validity statistically.

Nevertheless, a number of criteria must be met before running factor analysis. The first step is to assess the suitability of the data, for which two criteria must be met: ‘sample size and the strength of association among the variables (or items)’ (Pallant 2007, 180). Regarding the sample size, there are different views among researchers, the most conservative of which is the larger the better. In this study, the criterion was five to ten respondents for each item. With 20 items this amounts to 100 to 200 respondents, so the 193 participants who took part in the exploratory factor analysis phase met this criterion.

The second criterion concerning the suitability of running factor analysis is related to the inter-correlations among the items in the questionnaire. Bartlett’s test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure determine this criterion. For these two measures to indicate factorability of the data, Bartlett’s test of sphericity should be significant, that is, p < 0.05, and the KMO index, which ranges from 0 to 1, should not be below 0.6; otherwise the data will not be considered appropriate for running factor analysis.
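The two factorability checks named above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the SPSS routine behind the reported figures: the helper names and the 3 x 3 correlation matrix are hypothetical, the chi-square approximation is the textbook form of Bartlett’s statistic, and no p-value is computed because the standard library has no chi-square distribution (the statistic would be compared against a chi-square table or passed to a statistics package).

```python
import math


def invert(matrix):
    """Gauss-Jordan inversion; returns (inverse, determinant)."""
    n = len(matrix)
    a = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        pivot_value = a[col][col]
        det *= pivot_value
        a[col] = [x / pivot_value for x in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a], det


def bartlett_sphericity(corr, n_obs):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    p = len(corr)
    _, det = invert(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


def kmo(corr):
    """Overall KMO: squared correlations relative to squared partial correlations."""
    p = len(corr)
    inv, _ = invert(corr)
    r2 = q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                q2 += partial ** 2
    return r2 / (r2 + q2)


# Hypothetical correlation matrix for three items, illustration only.
toy_corr = [
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.3],
    [0.4, 0.3, 1.0],
]
chi2, df = bartlett_sphericity(toy_corr, n_obs=193)
print(f"Bartlett chi-square = {chi2:.2f}, df = {df}, KMO = {kmo(toy_corr):.2f}")
```

For a 20-item questionnaire the degrees of freedom are 20 x 19 / 2 = 190, whatever the data.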
For the current study, as shown in Table 3, the KMO measure was above 0.60 (KMO = 0.76) and Bartlett’s test of sphericity was significant (p = 0.00). These two values indicate that there are some significant factors to be extracted from the data.

Table 3. KMO and Bartlett’s test results.
KMO measure of sampling adequacy: 0.76
Bartlett’s test of sphericity
  Approximate chi-square: 754.16
  df: 190
  Significance: 0.00

After determining the factorability of the data, factor analysis was run based on principal components analysis (PCA). In order to decide on the number of factors to be retained, Kaiser’s criterion was used, according to which only factors with eigenvalues of 1.0 or more are retained. For the current questionnaire, the scree plot in Figure 1 indicates six factors with eigenvalues above 1. The six factors accounted for 77.24% of the total variance (usually anything over 60% is good in this case). These six factors accounted for 28.96%, 18.93%, 9.31%, 8.64%, 6.40% and 5.28% of the total variance, respectively. Variable communalities were greater than 0.30 for all the items; communality values for this questionnaire ranged from 0.53 to 0.74. The factor correlations for the questionnaire were all at acceptable levels, with the highest correlation between factor 1 and factor 4 (r = 0.71), followed by 1 and 2 (r = 0.70), 2 and 3 (r = 0.68), 3 and 4 (r = 0.67), 1 and 6 (r = 0.65), 4 and 5 (r = 0.62), 1 and 3 (r = 0.55), 2 and 5 (r = 0.55), 4 and 6 (r = 0.53), 3 and 6 (r = 0.50), 2 and 6 (r = 0.48), 1 and 5 (r = 0.45), 2 and 4 (r = 0.45), 3 and 5 (r = 0.44), and 5 and 6 (r = 0.44).

The results of factor analysis based on PCA, as shown in Table 4, indicate the six factors, how cleanly the items load on them and the size of their loadings. Some cross-loadings were also observed.
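The retention rule and the cross-loading check can be illustrated with the reported figures. This is a sketch under assumptions: the eigenvalues are back-calculated from the reported variance shares on the assumption that PCA was run on the correlation matrix of the 20 items (so total variance equals 20), the function names are hypothetical, and the 0.10 gap used to call two loadings "close" is an illustrative threshold, not one stated by the researchers.

```python
N_ITEMS = 20  # items in the final questionnaire

# Share of total variance per component, as reported in the text (percent).
pct_variance = [28.96, 18.93, 9.31, 8.64, 6.40, 5.28]

# Assumption: PCA on the correlation matrix, so eigenvalue = share * number of items.
eigenvalues = [share / 100 * N_ITEMS for share in pct_variance]


def kaiser_retained(eigs, cutoff=1.0):
    """1-based indices of components whose eigenvalue reaches the cutoff."""
    return [i for i, value in enumerate(eigs, start=1) if value >= cutoff]


# Items reported with two loadings (factors 1 and 4), values from Table 4.
double_loadings = {
    "v10": {1: 0.553, 4: 0.565},
    "v11": {1: 0.617, 4: 0.656},
    "v12": {1: 0.682, 4: 0.687},
    "v13": {1: 0.603, 4: 0.623},
    "v14": {1: 0.643, 4: 0.662},
}

CLOSE_GAP = 0.10  # hypothetical threshold for "very close" loadings


def describe(loadings):
    """Return (primary factor, gap between the two highest loadings)."""
    ranked = sorted(loadings.items(), key=lambda pair: pair[1], reverse=True)
    (primary, top), (_, second) = ranked[0], ranked[1]
    return primary, round(top - second, 3)


print("retained components:", kaiser_retained(eigenvalues))
for item, loadings in double_loadings.items():
    primary, gap = describe(loadings)
    label = "close cross-loading" if gap < CLOSE_GAP else "clear primary loading"
    print(f"{item}: primary factor {primary}, gap {gap} ({label})")
```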
Some of these cross-loadings were disregarded because the items loaded considerably higher on one factor than on the other. Nonetheless, four cross-loadings were very close on two separate factors. For instance, items 10 to 14 loaded on both factors 1 and 4. This occurred because of the close relation between the two factors. Items 10–14 can show both the learners’ attachment to the Persian language and their exposure to the language and the extent to which they use Persian. In other words, it can be partially justified by considering that the more individuals use Persian, the more attached they are to it. However, as shown in Table 4, the loadings are still higher on factor 4 than on factor 1. After checking the factor loadings, items that do not load highly on any of the factors are to be eliminated from the questionnaire. In this phase of questionnaire administration, all the items loaded acceptably on the six factors.

Confirmatory factor analysis and testing the model fitness

After the exploratory factor analysis, a confirmatory factor analysis was run to check whether the questionnaire data fit the model hypothesised at the outset of the study. For this purpose, the questionnaire was once again administered, this time to 482 language learners. The questionnaire for this phase of research was uploaded on SurveyMonkey (www.surveymonkey.com) and the participants filled it out online. Structural equation modelling (SEM), which is a multivariate analysis technique for exploring causality in models and the causal relations among variables, was run. SEM is rooted in positivist epistemology and was cobbled together out of regression analysis, path analysis and confirmatory factor analysis. It is used as a confirmatory technique to test models that are conceptually derived a priori or to test whether a theory fits the data.
SEM shows the relationship between latent variables, that is, the components of language identity in this study, and the observable variables, that is, the items in the questionnaire generated for each of the components in the language identity construct.

Table 4. Factor loading based on PCA.
Item   Loading(s) on components 1–6
v1     0.655
v2     0.623
v3     0.879
v4     0.567
v5     0.675
v6     0.723
v7     0.574
v8     0.562
v9     0.598
v10    0.553, 0.565
v11    0.617, 0.656
v12    0.682, 0.687
v13    0.603, 0.623
v14    0.643, 0.662
v15    0.752
v16    0.684
v17    0.644
v18    0.589
v19    0.763
v20    0.739
Note: Extraction method: PCA. Six components extracted.

In order to test the hypothesised model, AMOS 21 was run and the maximum likelihood method was used to estimate the parameters. The participants in this part of the study were 482 English language learners who ranged in age from 15 to 35 years, with a mean age of 22 years. They were from different parts of the country and had different demographic characteristics. The researchers deliberately sampled in this way in order to test the model for the whole country rather than limiting it to the capital city. Table 5 shows the descriptive statistics (e.g. age, gender, ethnicity and language proficiency level) for the participants in this phase of the study. As can be seen in Table 5, some of the respondents did not fill out the parts about their demographic information (missing data), and consequently 468 participants were included for SEM. The demographic information gathered here was intended to be included as variables in the model. Nevertheless, these variables were excluded from the model to avoid complexity. For models at their nascent stages, such as the one in this study, it is highly advisable not to make them convoluted. However, future studies (see Note 2) can utilise these variables as latent variables and hence discover the relations among all these variables.
In order to report the model fitness, there are three common absolute fit indices:
- χ², according to which a nonsignificant χ² (p > 0.05) indicates good fit;
- Root Mean Squared Error of Approximation (RMSEA), with acceptable fit < 0.10 and good fit < 0.05; hence the smaller the RMSEA, the better the model fits; and
- Goodness of Fit (GFI), where GFI > 0.90 is considered a good fit.

Table 5. Demographic information of the participants for the confirmatory factor analysis phase.
Variable              Category      n     %
Age                   11–15         46    9.6%
                      16–20         56    11.7%
                      21–25         128   26.7%
                      >25           249   52%
                      Total         479
Gender                Male          211   43.9%
                      Female        268   55.8%
                      Total         480
Ethnicity             Fars          298   62.7%
                      Azari         58    12.2%
                      Kurd          41    8.6%
                      Arab          40    8.4%
                      Lor           33    6.9%
                      Other         5     1%
                      Total         475
Language proficiency  Basic         22    4.5%
                      Elementary    46    9.6%
                      Pre-inter     52    10.8%
                      Inter         67    13.9%
                      High Inter    92    19.2%
                      Advanced      200   41.7%
                      Total         479

In this study, absolute fit indices were taken into account because there was no previous model to test this model against. The initial results of SEM showed poor fitness for the model. The reasons for this lack of fitness were related to the complexity of the model, which included not only the six factors but also some demographic information. Hence, some changes were made in the model to make it fit the data. These changes included removing some of the restrictions in the model, including the demographic information, and instead focusing on the main factors proposed a priori. In addition, one of the items (item 12) was removed because it showed low factor loadings. The model was thus revised and SEM was run once again. The output of the second SEM showed χ² = 4.42, df = 155, p = 0.02, which is a significant value for Chi-square.
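The way such indices are checked against their benchmarks can be sketched as follows. This is a hedged illustration, not AMOS output: the function names are hypothetical, the RMSEA expression is one common textbook formula (software packages may use N rather than N - 1), and the inputs are the values reported for the second SEM run (χ² = 4.42, df = 155, 468 participants, GFI = 0.974). The cut-offs are those given in the list above, plus the χ²/df < 3 benchmark listed in Table 6.

```python
import math


def chi2_df_ratio(chi2, df):
    """Relative chi-square, used because chi-square grows with sample size."""
    return chi2 / df


def rmsea(chi2, df, n):
    """One common RMSEA formula: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def assess_fit(chi2, df, n, gfi):
    """Pair each index with whether it meets the benchmark used in the text."""
    ratio = chi2_df_ratio(chi2, df)
    error = rmsea(chi2, df, n)
    return {
        "chi2/df": (ratio, ratio < 3),
        "GFI": (gfi, gfi > 0.90),
        "RMSEA": (error, error < 0.05),
    }


# Values reported for the revised model; n is the number of cases kept for SEM.
report = assess_fit(chi2=4.42, df=155, n=468, gfi=0.974)
for name, (value, ok) in report.items():
    print(f"{name}: {value:.3f} -> {'meets' if ok else 'misses'} the benchmark")
```

Because χ² is smaller than df here, the RMSEA numerator is truncated at zero, which is why an RMSEA of 0.00 is obtained.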
Since the Chi-square value depends on sample size and is usually significant for samples of 400 or more, χ²/df is used as a solution; here it is 4.42/155, reported as 0.02, which is considered an acceptable degree (see Table 6). The results of the second SEM also indicated GFI = 0.974 and RMSEA = 0.00, which were also acceptable. Table 6 presents the indices for SEM and shows a desirable level of fitness based on the output from AMOS 21. Hence, all the indices are at an acceptable level and the model appears to fit the data. In other words, the data gathered in this study seem to support this model. Figure 2 shows the schematic representation of the recursive model of language identity in Iran. Path coefficients are also placed on the pathways from each latent variable to other latent or observable variables to show the strength of relation or correlation among the variables.

Discussion and conclusion

The main goal of this study was to develop a model and test its fitness through a validated questionnaire. Hence, a model was initially hypothesised and later tested through a valid questionnaire. The results of this study showed that, although the model is the first one developed for the Iranian context, it enjoys a reasonable degree of reliability and validity, as confirmed by the statistical indices from SEM. The questionnaire also displayed a respectable degree of reliability and validity for future use in the Iranian context. Both the model and the questionnaire developed and validated in this study can have many uses and applications for future researchers. First of all, although both the model and the questionnaire are for the Iranian context, judicious changes can make them useful for other contexts too. Researchers from other linguistic contexts can also follow the steps in this study to develop and validate similar models and questionnaires for their own contexts.
However, contextual variances should be considered, and the model and the questionnaire should subsequently be rechecked for their reliability and validity. In spite of the statistical confirmation of the reliability and validity of the model and the questionnaire, the researchers recommend that more rigorous studies be conducted to test this model and probably add more components and subcomponents to it, as is the case with other models in language studies (e.g. the communicative competence model). In other words, although the data gathered in this study through a reliable and valid questionnaire seem to have fit the model, this does not make the model immune to other deficiencies. The researchers believe that replication studies collecting data from different groups of Iranians are required to reduce confounding variables and subsequently enhance the reliability and validity of this model.

Table 6. Selected fit measures for the final model.
Index    Current level   Accepted level
χ²       4.42            p > 0.05
χ²/df    0.02            < 3
GFI      0.98            > 0.90
RMSEA    0.00            < 0.05

Figure 2. Final model of language identity for English language learners in Iran. Note: F1, F2, F3, F4, F5 and F6 are the factors identified in EFA.

Moreover, developing such a model would also be a move towards quantitative approaches in identity research, where a more tangible picture of identity is obtained. In other words, the questionnaire developed in this study carries certain advantages over other methodological tools for identity research. One of the main benefits of developing such a questionnaire is its speed of data collection and objective scoring. In addition, the data can be much more easily extrapolated.
This questionnaire is intended to be used to further explore language identity in Iran and how demographic information can affect language learners’ identity in Iran. One last important note to be borne in mind is that this questionnaire is not recommended as the sole data collection instrument in research studies; it should be accompanied by interviewing in order to compensate for the shortcomings in the data gathered through questionnaires.

Acknowledgements

We are grateful to the Iranian Ministry of Science, Research, and Technology, which funded and supported this research. We would also like to thank Professor Ingrid Piller at Macquarie University for her support, critical insights and intellectual guidance. Special thanks are also extended to Dariush Izadi for his help and encouragement during this project.

Notes

1. The complete version of the questionnaire in both English and Persian is available upon request.
2. This study is part of a larger PhD project and the model and the questionnaire developed here are utilised in a follow-up nation-wide survey to study the language identity level of Iranian English language learners from different age groups, genders and language proficiency levels.

References

Alderson, C. J., and J. Banerjee. 1996. How Might Impact Study Instruments be Validated? Unpublished manuscript commissioned by UCLES.
Baker, C. 2011. Foundations of Bilingual Education and Bilingualism. 5th ed. Bristol: Multilingual Matters.
Balistreri, E., N. A. Busch-Rossnagel, and K. F. Geisinger. 1995. “Development and Preliminary Validation of the Ego Identity Process Questionnaire.” Journal of Adolescence 18 (2): 179–192. doi:10.1006/jado.1995.1012.
Bani-Shoraka, H. 2009. “Cross-Generational Bilingual Strategies among Azerbaijanis in Tehran.” International Journal of the Sociology of Language 198: 105–127. doi:10.1515/IJSL.2009.029.
Beeman, W. O. 2010. “Sociolinguistics in the Iranian World.” In The Routledge Handbook of Sociolinguistics in the World, edited by M. J. Ball, 139–148. London: Routledge.
Block, D. 2007. Second Language Identities. London: Continuum.
Block, D., and D. Cameron, eds. 2002. Globalization and Language Teaching. London: Routledge.
Boroujerdi, M. 1998. “Contesting Nationalist Constructions of Iranian Identity.” Critique 12: 43–55. doi:10.1080/10669929808720120.
Bourdieu, P. 1991. Language and Symbolic Power. Translated by G. Raymond and M. Adamson and edited by J. B. Thompson. Cambridge: Polity Press. (Original work published in 1982.)
Brady, S., and W. J. Busse. 1994. “The Gay Identity Questionnaire.” Journal of Homosexuality 26 (4): 1–22. doi:10.1300/J082v26n04_01.
Brown, J. D. 2001. Using Surveys in Language Programs. Cambridge: Cambridge University Press.
Brutt-Griffler, J. 2002. World English: A Study of Its Development. Clevedon: Multilingual Matters.
Converse, J. M., and S. Presser. 1986. Survey Questions: Handcrafting the Standardized Questionnaire. London: Sage.
Coupland, N., ed. 2010. The Handbook of Language and Globalization. Malden, MA: Wiley-Blackwell.
Dewaele, J. M., A. Housen, and L. Wei. 2003. Bilingualism: Beyond Basic Principles: Festschrift in Honour of Hugo Baetens Beardsmore. Clevedon: Multilingual Matters.
Dörnyei, Z. 2010. Questionnaires in Second Language Research: Construction, Administration, and Processing. 2nd ed. London: Routledge.
Dyer, J. 2007. “Language and Identity.” In The Routledge Companion to Sociolinguistics, edited by C. Liamas, L. Mullany and P. Stockwell, 101–108. London: Routledge.
Edwards, J. 2009. Language and Identity: An Introduction. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511809842.
Ehala, M. 2012. “Cultural Values Predicting Acculturation Orientations: Operationalizing a Quantitative Measure.” Journal of Language, Identity, & Education 11 (3): 185–199. doi:10.1080/15348458.2012.686388.
Garrett, P. 2010. Attitudes to Language. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511844713.
Giles, H., and R. N. St. Clair, eds. 1979. Language and Social Psychology. Oxford: Basil Blackwell.
Golan-Cook, P., and E. Olshtain. 2011. “A Model of Identity and Language Orientations: The Case of Immigrant Students from the Former Soviet Union in Israel.” Journal of Multilingual and Multicultural Development 32 (4): 361–376. doi:10.1080/01434632.2011.579128.
Hassanpour, A., J. Sheyholislami, and T. Skutnabb-Kangas. 2012. “Introduction Kurdish: Linguicide, Resistance and Hope.” International Journal of the Sociology of Language 217: 1–18. doi:10.1515/ijsl-2012-0047.
Jenkins, J. 2007. English as Lingua Franca: Attitude and Identity. Oxford: Oxford University Press.
Joseph, J. 2004. Language and Identity: National, Ethnic, Religious. Basingstoke: Palgrave Macmillan.
Liamas, C., and D. Watt. 2010. Language and Identities. Edinburgh: Edinburgh University Press.
Meskoob, S. 1992. Iranian Nationality and the Persian Language. Washington, DC: Mage Publishers.
Modarresi, Y. 2001. “Aspects of Sociolinguistics in Iran.” International Journal of the Sociology of Language 148: 1–3. doi:10.1515/ijsl.2001.012.
Norton, B. 2000. Identity and Language Learning: Gender, Ethnicity, and Educational Change. Harlow: Pearson Education.
Omoniyi, T., and G. White, eds. 2006. The Sociolinguistics of Identity. London: Continuum.
Pallant, J. 2007. SPSS Survival Manual: A Step-by-step Guide to Data Analysis Using SPSS for Windows. 3rd ed. McGraw Hill: Open University Press.
Payne, J. R., and B. Mahmoodi-Bakhtiari. 2009. “Iranian Languages.” In The World’s Major Languages, 2nd ed., edited by B. Comrie, 437–444. London: Routledge.
Pennycook, A. 1998. English and the Discourses of Colonialism. London: Routledge.
Phinney, J. S. 1992. “The Multi Group Ethnic Identity Measure: A New Scale for Use with Diverse Groups.” Journal of Adolescent Research 7 (2): 156–176. doi:10.1177/074355489272003.
Polat, N., and L. J. Mahalingappa. 2010. “Gender Differences in Identity and Acculturation Patterns and L2 Accent Attainment.” Journal of Language, Identity, and Education 9 (1): 17–35. doi:10.1080/15348450903476832.
Rezaei, S. 2012. “Researching Identity in Applied Linguistics.” Journal of Language, Culture and Society 35: 45–51. http://www.aaref.com.au/en/publications/journal/journal-articles/issue-35-2012/.
Ricento, T. 2005. “Considerations of Identity in L2 Learning.” In Handbook of Research in Second Language Teaching and Learning, edited by E. Hinkel, 895–911. Mahwah, NJ: Lawrence Erlbaum.
Ricento, T., ed. 2006. An Introduction to Language Policy: Theory and Method. London: Blackwell.
Shaaban, K., and G. Ghaith. 2003. “Effect of Religion, First Foreign Language, and Gender on the Perception of the Utility of Language.” Journal of Language, Identity, and Education 2 (1): 53–77. doi:10.1207/S15327701JLIE0201_3.
Shohamy, E. 2006. Language Policy: Hidden Agendas and New Approaches. London: Routledge.
Simpson, A. 2007. Language and National Identity in Asia. Oxford: Oxford University Press.
Spolsky, B. 2003. Language Policy. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511615245.
Spolsky, B. 2011. “Ferguson and Fishman: Sociolinguistics and the Sociology of Language.” In The Sage Handbook of Sociolinguistics, edited by R. Wodak, B. Johnstone and P. E. Kerswill, 11–23. London: Sage.
Van Zomeren, M., T. Postmes, and R. Spears. 2008. “Toward an Integrative Social Identity Model of Collective Action: A Quantitative Research Synthesis of Three Socio-psychological Perspectives.” Psychological Bulletin 134 (4): 504–535. doi:10.1037/0033-2909.134.4.504.
Windfuhr, G. L. 2009. “Persian.” In The World’s Major Languages, 2nd ed., edited by B. Comrie, 445–459. London: Routledge.
Yarshater, E. 1993. “Persian Identity in Historical Perspective.” Iranian Studies 26 (1–2): 141–42. doi:10.1080/00210869308701791.