Vocabulary frequency: playing with the MultiLingProfiler tool

I thought I would try out the MultiLingProfiler tool linked from the website. You can find it here:

The idea is that you can test a text you have sourced or written to see how many words fall outside the 2000 most frequent words NCELP use for their vocab frequency bank. I copied and pasted a French text from, one I wrote for Higher Tier GCSE pupils. It's an interview with a female astronaut, adapted from an online source somewhere.

The tool highlights in orange any words which don't feature in the top 2000. Have a quick look at the text below. You'll note that the tool doesn't deal easily with verb chunks such as "avez-vous", so you can discount examples like that. Frequency counts drawn from corpora always produce surprising anomalies. So in the case below, words which you might be surprised to find in the top 2000 include:

formation, partenaire, exigences, recueillir, fonctionner, quotidiennes, s'entraîner, affecté

Whereas words which ARE NOT in the top 2000, and which might - stress might - surprise you, are:

mathématiques, ingénieurs, secondaire, mécaniques, pilote, incroyable
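For anyone curious about what a check like this involves, here is a minimal sketch in Python. It is not the MultiLingProfiler's actual code, which I haven't seen. The function names and the tiny word list are made up for illustration, and a real profiler would presumably match inflected forms to their headwords, which this sketch does not attempt. Note that splitting on hyphens and apostrophes breaks up chunks such as "avez-vous" into separate words.

```python
import re

# Hypothetical stand-in for a 2000-word frequency list (NOT the real NCELP list).
SAMPLE_LIST = {"j", "ai", "travaillé", "avec", "le"}

def tokenize(text):
    """Lower-case the text and return its runs of letters as words.

    Hyphens and apostrophes act as separators, so "avez-vous" becomes
    "avez" and "vous", and "J'ai" becomes "j" and "ai".
    """
    return re.findall(r"[^\W\d_]+", text.lower())

def off_list_words(text, frequency_list):
    """Return the words not in the list, in order of first appearance."""
    found = []
    for word in tokenize(text):
        if word not in frequency_list and word not in found:
            found.append(word)
    return found

def coverage(text, frequency_list):
    """Return the share of running words (0 to 1) that are in the list."""
    words = tokenize(text)
    if not words:
        return 0.0
    on_list = sum(1 for word in words if word in frequency_list)
    return on_list / len(words)

sentence = "J'ai travaillé avec le pilote"
print(off_list_words(sentence, SAMPLE_LIST))  # ['pilote']
```

With a genuine frequency list in place of SAMPLE_LIST, off_list_words would give you the words to consider glossing, and coverage the proportion of the text that is on the list.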

Now, we know that the sources of vocabulary frequency lists vary and may bear only partial resemblance to the language teenagers might encounter or want to know. So it would be surprising if there weren't apparent anomalies, when looked at from a teacher's point of view.

One thing which stands out to me (and this applies strongly to French) is the number of cognate words which help with comprehension. So the following words, which are NOT in the top 2000, are nevertheless easy to work out from an English speaker's point of view. (The same would not necessarily apply to a speaker of a different language.)

astronaute, mathématiques, secondaire, navale, aviation, océanographique, mécaniques, pilote

This is based on an assumption that the teenager would know these words in English. They may struggle with aviation or océanographique.

Should we keep in mind the presence and frequency of cognates?

It seems to me that, while frequency is a very important factor to keep in mind when curriculum planning, you have to handle it carefully. You need to factor in the needs of the target audience, the availability of cognates (NCELP understandably decided to include them) and the thematic material you want students to hear, read, speak and write about.

The fact that NCELP is reluctant to specify topics or themes means that the frequency list they use may be too random or inappropriate in some ways for secondary learners. They rightly point out that texts largely contain high-frequency words (80% of words in a typical text would be from the top 2000), but you may still get anomalies.

NCELP have also mentioned that teachers should not follow frequency lists too slavishly. This is true. Interesting material often contains rarer words and you'd be crazy not to teach them. Whether the 2000 words allow examiners to produce interesting, usable texts in papers remains to be seen. NCELP seem to think this will work if a small amount of glossing is allowed (up to 2% of the words in a text).

In the example below, you would need to gloss around a dozen words (not including highlighted chunks such as travaillez-vous). Copied into Word, the text comes out at 368 words. My maths suggests that I would need to gloss just over 3% - not too far off the NCELP figure. I could have simplified the text further too.
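The sums are easy to check. Here is a small sketch, assuming "around a dozen" means exactly 12 glossed words; the function names are my own.

```python
def gloss_share(glossed_words, total_words):
    """Return glossed words as a percentage of the whole text."""
    return 100 * glossed_words / total_words

def gloss_allowance(total_words, limit_percent=2):
    """Return how many whole words may be glossed under a percentage cap."""
    return int(total_words * limit_percent / 100)

# Assumption: "around a dozen" is taken as exactly 12 words.
print(f"{gloss_share(12, 368):.1f}%")  # 3.3%
print(gloss_allowance(368))            # 7
```

So a 2% cap on a 368-word text allows about 7 glossed words, against the dozen or so this text would need.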

All this assumes that pupils would understand the high-frequency words, which of course they may or may not.

Anyway, just a bit of fun! You might like to try out the profiler yourself.

No one doubts that, for beginners especially, it's better to focus on common words rather than rare ones, but using published frequency lists can lead to some peculiar outcomes.

1. Depuis combien de temps travaillez-vous pour la NASA?

Je suis à la NASA en tant qu'employé du gouvernement depuis 13 ans.

2. Quel type de formation est nécessaire pour devenir astronaute?

Bien sûr, pour être considéré comme astronaute, on recherche non seulement des bases fortes en mathématiques et en sciences, mais aussi une bonne éducation générale car on doit être capable de bien communiquer avec les ingénieurs et scientifiques. En plus il faut parler d’autres langues pour parler avec nos partenaires internationaux.

3. Y a-t-il des qualités physiques nécessaires pour être astronaute?

Vous ne pouvez pas être trop grand, ni trop petit. A part cela, il y a des exigences médicales. Il ne faut pas avoir une condition qui ne peut pas être traitée dans l'espace.

4. Comment êtes-vous devenue astronaute?

Eh bien, j’ai eu de la chance. Dès ma sortie de l'école secondaire, je suis allée à la US Naval Academy et je suis entrée dans l'aviation navale. J'ai volé avec une équipe de recherche océanographique où nous avons voyagé pour faire des expériences et recueillir des données océanographiques. Alors, j'ai quitté la Marine et suis allée travailler au Centre spatial Kennedy en tant qu'ingénieur et j'ai travaillé sur les systèmes mécaniques des navettes spatiales, avant d’être choisie pour être astronaute.

5. Combien de fois avez-vous été dans l'espace?

Je n’ai fait qu’un voyage jusqu'à présent sur la navette spatiale Columbia.

6. Quand vous êtes allée dans la navette Columbia, quelles ont été vos responsabilités?

J'étais spécialiste de mission et j'ai travaillé avec le pilote pour aider à faire fonctionner les systèmes de la navette spatiale, mais nous avons également transporté 25 expériences auxquelles j’ai participé.

7. Quelles sont vos responsabilités quotidiennes quand vous n'êtes pas dans l'espace?

Quand les astronautes ne sont pas dans l'espace, l'une des principales choses que nous faisons est de continuer à s'entraîner pour être prêt à être affecté à une autre mission dans l'espace.

8. Quelle est la partie préférée de votre travail?

C’est travailler ensemble en équipe. Quand on va dans l'espace, sur la navette spatiale en particulier, vous pouvez avoir sept membres d'équipage. On s’entraîne ensemble depuis des années et on développe une amitié incroyable avec ces personnes.

