
LAVIS5 / SECOL91

I had a good time at LAVIS5 / SECOL91 this weekend, a linguistics conference about the southern United States. Louisiana was well represented. Here’s my contribution.

The dive into open science.

Finally, I decided to do a project following open science guidelines. I’ve spoken about this in what is now a rather old post. Back then, I was describing a workflow that resembles software development. Unfortunately, even though I’ve implemented this workflow since, I’ve always felt too rushed to do any preregistrations or even submit my projects to the IRB, but for my first qualifying paper, I want to do it right. You can follow the progress on the OSF.

Master’s thesis now available.

I’m very happy to be able to say that my master’s thesis, entitled LOL sur Twitter: une approche du contact de langues et de la variation par l’analyse des réseaux sociaux, has been published to the digital library website at UQAM. If you’re interested in linguistic variation, French on Twitter, social network analysis as it applies to language contact, or simply internet abbreviations like lol, please download it and read it. You can find it in the following locations:

Working your flow.

Grad school is essentially a juggling act: you not only have to perform in your classes, but you have to work as a TA or RA, do your own research, and make sure that you nurture your social life so that you don’t go insane. One way to make this task much more manageable can be summed up in the immortal words of Scrooge McDuck:

One way to work smarter, not harder, is to make sure you’re using the right technological tools. For me, in particular, two issues stuck out while writing my master’s thesis:

  1. Finding a way to easily collaborate with my committee members
  2. Finding a way to deal with numerous drafts

These were sort of interrelated issues, too. At times, I had already written several more drafts by the time I got comments back on a draft I had e-mailed to a committee member, which could lead to a bit of confusion or simply waste time if I had already spotted and corrected an issue that was commented on. These issues really apply to all research, though, as all research can (and to some degree probably should) involve collaboration and numerous rewrites.

My solution to these two issues has been to develop a workflow that treats writing a paper more like writing software. In a way, I mean this literally, as my current workflow involves programming rather than using Word to write papers or Excel to manage data or something like SPSS to perform statistical analyses. These programs are nice in that they’re pretty easy to learn, but they also don’t integrate very smoothly and transparently with each other, don’t play well with the internet/cloud systems, and create all sorts of compatibility issues if your collaborators aren’t using the same exact tools. The alternative is using tools that involve only standardized, text-based file types. This means learning how to code. I know that sounds scary for many people, and the learning curve is certainly higher than figuring out where the table button is in Word, but the learning curve is in my opinion often overstated, and the payoff of overcoming that curve is pretty great. Remember, work smarter, not harder.

The first component of my current workflow is the website ShareLaTeX.1 This site allows you to produce .pdf documents using a markup language called LaTeX. A markup language is a very simple programming language that lets you format plain text by occasionally inserting little tags when you need to make something bold or create a header or whatever. You write your text up with your little tags in an input file, in this case a .tex file, which then spits out a .pdf document when you compile it. For instance, if I wrote the following in my .tex file:

This is my super cool first section where I talk about DuckTales.
\section{Get to the point}
This is my next section where I get to the point and say something useful.

I would get a .pdf that looks something like this:

This is my super cool first section where I talk about DuckTales.

Get to the point

This is my next section where I get to the point and say something useful.

That’s more or less it. You can do quite a bit with LaTeX (there are numerous independently developed packages that extend its capabilities even beyond what the base system can do), but, for many researchers, you can learn almost all you’ll need to know to use it with just a few days of running through tutorials and/or hunting down the tags that will allow you to create whatever you need, e.g. footnotes, tables, citations, perhaps syntax trees for linguists, etc. There are many offline editors that allow you to write and compile .tex files, but ShareLaTeX itself is an online editor, so you can avoid figuring out how to install LaTeX and an editor on your laptop by using the site. One added bonus, too, is that if the phrase “compiling a .tex file” sounds intimidating to you, ShareLaTeX simply gives you a “Compile” button that does it all for you and shows you the resulting .pdf document.

ShareLaTeX has many other bonuses, though, because it’s really a collaboration tool. You create a project and invite collaborators or advisors who then have real time access to anything that’s in that project and any changes that are being made. In my case, a project might include a .tex file, a .bib file2 containing my list of references (a standardized, human-readable text file format for automatically handling citations in .tex documents), .csv files for data (again, a standardized, human-readable text file format), and .R scripts that perform statistical analyses and produce figures and tables (which are again non-proprietary, human-readable text files). Collaborators can comment on the text, check out the data that text was based on, and see exactly how you analyzed it, all in one place. ShareLaTeX even has a form of version control so that you can get back to an earlier draft of your paper if necessary and collaborators can see how each paragraph has been changed. It’s basically like a super-powered Google Docs and ultimately far more efficient than trying to create your own version control system out of a bunch of Word documents that you then have to e-mail back and forth to each collaborator separately.
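To make the .tex and .bib pairing concrete, here is a minimal sketch. The file name refs.bib and the citation key chomsky2002 are made up for illustration, and in a real project the .bib file would normally sit next to the .tex file as its own file rather than being embedded like this.

```latex
% main.tex: a minimal sketch, not a template from any particular project.
% The filecontents environment writes refs.bib on compilation if that file
% does not already exist, which keeps this example self-contained.
\begin{filecontents}{refs.bib}
@book{chomsky2002,
  author    = {Chomsky, Noam},
  title     = {Syntactic Structures},
  edition   = {Second},
  publisher = {Mouton de Gruyter},
  address   = {Berlin; New York},
  year      = {2002}
}
\end{filecontents}

\documentclass{article}
\begin{document}

% \cite looks up the key in refs.bib and inserts the in-text citation.
See \cite{chomsky2002} for the original argument.

% These two lines generate the reference list automatically.
\bibliographystyle{plain}
\bibliography{refs}

\end{document}
```

The point of the design is that the reference list is never typed by hand: adding or removing a \cite command is enough to update it the next time the document compiles.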

Another big advantage of writing your papers in LaTeX is that you can add R code directly into a LaTeX document3 via an R package called knitr. What this means is that when your analyses change or your data changes, your write-up will automatically be updated, as well. No longer do you need to tediously figure out how you generated a figure or a number, go back to another program, change your analysis, regenerate the figure or number, create an image or something of that sort, switch the old image with the new one in your document, and then hope that you didn’t miss a reference to it somewhere else in your paper. Instead, find the relevant paragraph in your .tex file, change a number or whatever, press compile, and you’re done. Remember, work smarter, not harder.
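As a rough sketch of what this looks like in practice, the file below mixes LaTeX with R chunks. It assumes the noweb-style chunk syntax and the \Sexpr{} inline command that knitr uses for LaTeX input; the file name, data.csv, and its community column are all invented for the example.

```latex
% report.Rtex: hypothetical file name; data.csv and its "community" column are made up.
\documentclass{article}
\begin{document}

<<load-data, echo=FALSE>>=
# Read the data once; later chunks and inline code reuse this object.
tokens <- read.csv("data.csv")
@

% The number below is recomputed from the data every time the file is compiled.
The corpus contains \Sexpr{nrow(tokens)} tokens.

<<freq-plot, echo=FALSE, fig.cap='Token counts by community'>>=
# The figure is regenerated from the current data on each compile.
barplot(table(tokens$community))
@

\end{document}
```

If data.csv changes, recompiling updates both the sentence and the figure, with no copying and pasting between programs.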

What this also means is that you don’t want to just learn LaTeX, you want to learn R, too. R is another programming language that’s specifically designed for doing statistics. There’s more of a learning curve for R than for LaTeX, but R is also extremely popular. There’s a very good chance that your university offers seminars for learning it or that you have a colleague who knows it already. In the unlikely chance that neither of these things is true, there are also a huge number of online tutorials and free courses for learning it.4 As with LaTeX, R can do far more than what you need it to do for you, so the trick is to learn some basics and then focus on learning what you need for whatever project you’re doing. In my case, I taught myself enough R to analyze all the data for my thesis in about two weeks.

So ShareLaTeX provides an online environment to store your write-up, your references, your data, and your analyses. It also provides exceptional tools for collaborating. What it’s not so great for is sharing your work with the world. I don’t just mean publishing a paper, but making your data and analyses available to the public and other researchers for free, i.e. partaking in open science. To accomplish this, we need a second component for our workflow, a website called GitHub. GitHub technically exists for developing open source software, so why use it for science? Because this is the future, and in the future we do open science, and we write papers as if we’re writing software.

Another reason is that we can automatically sync our projects to GitHub from ShareLaTeX, and GitHub can then sync them to numerous other sites. You can even sync a GitHub project with a local version on your laptop using Git, which is simply the version control software that GitHub itself uses. In this way, you can work without internet access but still maintain a consistent system of drafts and rewrites that don’t get confusing. In fact, that’s really the whole purpose of Git and GitHub. They keep track of changes to text files (e.g. .tex files or .bib files or .csv files or .R script files) so that you don’t have to. This, combined with GitHub’s popularity, makes it the perfect tool to act as the hub for your workflow.

But GitHub is also very social. Once you have a project on the site, anyone can make their own copy of it and do what they want with it. Perhaps they had a better idea for how you could have done a statistical analysis: they can literally redo it themselves and then merge their changes back into your project, with your approval of course. Perhaps they want to replicate your study or apply new analyses to your data: this is all perfectly simple once the project is on GitHub. This is how open source software development has worked for a while, and scientific research can just as easily benefit from this sort of workflow and openness.

Still, GitHub is not a science-centric site. This means that it’s missing two important elements: 1) it doesn’t make your project super-visible to other researchers, and 2) it doesn’t facilitate open science processes like preregistrations and publishing preprints.5 Luckily, GitHub allows you to sync your project to the Open Science Framework (OSF), the third component of our workflow, which can handle all of the above. The OSF is not so great for collaboration, even though it has a rudimentary commenting system and a rudimentary version control system, but it’s an ideal place to create preregistrations, increasing validity and transparency for your research. These can then all be linked to preprint versions of your research that can be uploaded to preprint servers like arXiv or, if you’re in the social sciences like me, SocArXiv, which the OSF hosts. In fact, ShareLaTeX, once merged with Overleaf, will most likely support direct submission to preprint servers, which includes formatting and all, since Overleaf currently has this feature.

So, to summarize, the workflow described here has four components:

  1. ShareLaTeX (your main work area for you and your named collaborators)
  2. GitHub (a central hub that makes your work public and allows for anonymous collaboration)
  3. The OSF (fulfills all your open science and study design validity needs)
  4. Git (your offline tool for working without internet access)

Using these tools involves more of a learning curve than the old-fashioned Word + e-mail methods (you’ll need to learn some LaTeX, some R, and how Git and GitHub work, though these latter two can really be learned in just a couple hours), but once you get over that curve, your life will be significantly easier. You’ll be able to spend your time thinking about your research instead of thinking about how to manage your research and how to keep everyone in the loop. This is the essence of working smarter, not harder: if you put a little more effort in on the front end, you’ll catch up and get far ahead on the back end.


  1. At the time of this writing, ShareLaTeX is in the process of merging with a similar site called Overleaf, which will eventually yield Overleaf v2. I’ve been beta testing v2, however, and it appears, thankfully, to be almost identical to ShareLaTeX.
  2. This is a really important concept for working smarter, not harder, as well: do not ever deal with references and citations by hand. Personally, I recommend using Zotero for reference management. Zotero allows you to import sources, complete with .pdf’s, with the push of a button in your web browser, and it syncs these sources to its website as well as any other computers that you’re running the application on. You can then create .bib files directly from Zotero, which allows you to create in-text citations and automatically generate bibliographies in your .tex LaTeX document.
  3. On ShareLaTeX, you rename your file from .tex to .Rtex to accomplish this. Otherwise, nothing changes.
  4. To maybe a lesser extent, this is all true of LaTeX, as well. LaTeX has been around for decades and has been the method of choice for writing papers in many math-heavy fields.
  5. I won’t get into the benefits of using registrations and preprints here, as I’m just trying to outline an effective workflow, but I highly recommend looking into them.

Linguistics as engineering.

I’ve never liked Chomsky, despite never reading anything by him. His ideas are so prevalent in linguistics, at least in American universities, that you don’t really have to read his work to be exposed to his ideas. However, it’s important to me to have a good idea of the context within which ideas have been proposed and developed, so I finally read Syntactic Structures (Chomsky, 1957/2002), which I think encapsulates everything I dislike about Chomsky and the sort of theoretical linguistics that his ideas have led to.

First of all, though, let me say that I do not think Syntactic Structures is a worthless book. Even though I disagree with much of what Chomsky wrote, he did pose some interesting questions, and even that alone gives it value. For instance, Chomsky argued that grammars should be developed using nothing but formal means, disregarding semantics completely (pp. 93-94). There are several reasons why I don’t think this is correct, which I won’t get into here, as my point is simply that this is an interesting question to consider.

What I don’t like about Chomsky and the sort of theoretical linguistics that he spawned is the near complete disregard for empirical evidence for anything. Theoretical linguistics has relied almost entirely on intuitions for its “data”, often the intuitions of linguists themselves, not of informants. Despite Syntactic Structures often being credited as a foundational work for cognitive science, it never once suggests that linguists use things like experimentation to validate their theories as those in other scientific fields dealing with cognition would do, such as psychologists and neuroscientists.

There are two things in Syntactic Structures that I think have given linguists cover to approach their “science” this way:

  1. Chomsky argued that grammars have nothing to do with synthesis or analysis (p. 48)
  2. Chomsky argued that the goal of linguistic theory is to develop an evaluation procedure (pp. 50-52)

By synthesis and analysis, Chomsky meant how humans produce language and how they understand language, respectively. He didn’t think that grammars address these questions, which is patently bizarre. What exactly do grammars describe if not one or both of these things? It seems that one is instead engineering how a grammar could work for some imagined artificial being, in which case we don’t need to consider empirical evidence generated by observing or experimenting on real human beings.

As for the evaluation procedure, Chomsky meant that developing a linguistic theory that could tell us if a given grammar is the correct grammar for a given language is too hard, and developing a linguistic theory that could generate a grammar from a corpus is even harder, so we’re better off developing a linguistic theory that simply tells us if one grammar is better than another for describing a given language. And what is the criterion? Simplicity.

The problem with focusing on an evaluation procedure, though, is that this downplays the importance of empirical evidence once again. There’s no need to test human beings to figure out if they employ transformations, for instance; we just need to show that transformations simplify the grammar more than some other proposal would, that other proposal also having been developed without any regard for testing if it actually represents what happens in the heads of human beings.

Ultimately, the direction that Chomsky set out for linguistics in Syntactic Structures seemed to be about how best to engineer an efficient grammar, not how to understand how humans do language. If Chomskyan linguistics actually does explain what humans do, that result is purely accidental, as there’s nothing about how it’s done that would be able to establish that connection.

Unsurprisingly, what the results of Chomsky’s approach to linguistics seem most useful for is developing speech synthesis and speech analysis software, i.e. engineering. There’s no need for AIs to do language in the same way that humans do language; they simply have to work. And I’m very much happy that they do. I use Google Assistant all the time, and I can’t wait to be able to speak to my house like the crew of the USS Enterprise speaks to their spaceship.

However, as far as advancing linguistics as a science, I think Chomsky’s approach, as set out in Syntactic Structures, has led to a monumental waste of time and resources. Numerous very intelligent and creative linguists have now spent some 60 years essentially playing a puzzle game that has not shed any light whatsoever on how exactly humans do language, and I don’t think it’s going too far to say that Chomsky’s ideas, combined with his enormous influence in the field, are to blame.


Chomsky, N. (2002). Syntactic Structures (2nd ed.). Berlin; New York: Mouton de Gruyter. (Original work published 1957)

How conflating terminology helps racists validate their racism.

Somewhat related to a recent post of mine, I came across this troubling article in the NY Times by David Reich, a Harvard geneticist who seems to regularly be described as “eminent”, in which he argues that “it is simply no longer possible to ignore average genetic differences among ‘races.’” He seems to have positive intentions (he even begins the article by acknowledging that race is a social construct), and I have no doubt that his knowledge of genetics is lightyears beyond my own non-existent knowledge of that subject, but despite his intentions and knowledge in that field, he seems to not have consulted with social scientists at all. The crux of the issue is that he conflates “race” with “population”. Indeed, immediately after acknowledging that race is a social construct, he states the following:

The orthodoxy goes further, holding that we should be anxious about any research into genetic differences among populations.

He seems to be using the two terms as synonyms, or at the very least, he’s being careless enough with his use of the two that it appears that he’s using them as synonyms. I seriously doubt that there are any respected geneticists who would argue that genetic differences among populations do not exist, but that’s not at all the same as making an argument about whether genetic differences between races exist.

There are already two good responses to the article, one in BuzzFeed, co-signed by some 67 scientists, and another by sociologist Ann Morning, who also co-signed the BuzzFeed article. These do a pretty good job of explaining the problem with Reich’s article — although I think the BuzzFeed article would have been better if they had not attempted to comment on genetic findings as much — so I just want to talk about Reich’s example from his own research supposedly showing how race can be used productively to study genetics. Here’s the relevant quote from his article:

To get a sense of what modern genetic research into average biological differences across populations looks like, consider an example from my own work. Beginning around 2003, I began exploring whether the population mixture that has occurred in the last few hundred years in the Americas could be leveraged to find risk factors for prostate cancer, a disease that occurs 1.7 times more often in self-identified African-Americans than in self-identified European-Americans. This disparity had not been possible to explain based on dietary and environmental differences, suggesting that genetic factors might play a role.

Self-identified African-Americans turn out to derive, on average, about 80 percent of their genetic ancestry from enslaved Africans brought to America between the 16th and 19th centuries. My colleagues and I searched, in 1,597 African-American men with prostate cancer, for locations in the genome where the fraction of genes contributed by West African ancestors was larger than it was elsewhere in the genome. In 2006, we found exactly what we were looking for: a location in the genome with about 2.8 percent more African ancestry than the average.

When we looked in more detail, we found that this region contained at least seven independent risk factors for prostate cancer, all more common in West Africans. Our findings could fully account for the higher rate of prostate cancer in African-Americans than in European-Americans. We could conclude this because African-Americans who happen to have entirely European ancestry in this small section of their genomes had about the same risk for prostate cancer as random Europeans.

Reich offers this as an example of how using race as a variable can be fruitful, but I think what he really does is undermine his own argument. What he’s ultimately talking about here is not African-Americans, but people with a section of their genome matching that which was commonly found in people who lived in West Africa. This appears to be the population that’s relevant to his study, yet he insists on talking about his results in terms of a race instead, repeatedly referring to African-Americans, a culturally diverse group that’s too often treated as monolithic and who don’t even necessarily have this ancestry, a fact that Reich admits in this very passage.

The use of the label African-American in his explanation serves no explanatory purpose and in fact is not even very precise. What it does do is make it easy for racists to claim that some Harvard geneticist has validated their racism, and confuse laymen who aren’t versed in subtle terminology distinctions for referring to groups of people, which Reich himself doesn’t even seem to be versed in. He repeatedly describes these subjects as “self-identified”, which I assume he does in order to take responsibility for using the label out of his own hands, but as I explained in my previous post, this strategy offers no protection at all for people who would be hurt by the stereotypes that are generated when using social variables like race.

Indeed, my admittedly unscientific survey of Twitter has led me to what appear to be three types of reactions to the piece: 1) social scientists pointing out how irresponsible the article is, 2) geneticists mocking “soft scientists” and/or praising the article as a fantastically delicate treatment of a difficult topic, and 3) blatant, hardcore racists using the article as validation for their racism. (3) should be troubling enough to those in (2) to convince them to go talk to those in (1) about how to better deal with the social side of their research.

Interpreting uninterpretable P-values.

Lately, I’ve been trying to learn more about open science and how it relates to research I’ve done, research I’d like to do, and sociolinguistics in general. One topic that comes up regularly when talking about open science is pre-registration. For those who aren’t familiar with this process, pre-registration refers to publishing a detailed, time-stamped description of your research methods and analyses on some repository before ever actually looking at your data. Doing so increases transparency for the research and helps the researcher avoid P-hacking, aka data fishing1. There are apparently some arguments against pre-registering research, but I’ve yet to see any that don’t mischaracterize what pre-registration actually is, so it seems like a no-brainer to do it.

But in looking into the actual mechanics behind producing a pre-registration, I ended up watching the following webinar from the Center for Open Science (COS) about using their Open Science Framework (OSF) to publish pre-registrations, which included this curious description of how to interpret P-values in different kinds of research2:

Basically, the claim is that pre-registration makes it clear which analyses are confirmatory3 and which are exploratory, which is great, but the other part of the claim is that P-values are uninterpretable in exploratory research. In other words, any P-values that are generated through analyses that weren’t pre-registered, i.e. through data fishing, are meaningless.

I can understand why this point is made, but I think it’s a bad point. Pre-registration does seem to create another level in the hierarchy of types of research — i.e. exploratory (observational, not pre-registered) > confirmatory (observational, pre-registered) > causal (experimental) — but I see no reason why P-values are uninterpretable at the exploratory level. It would seem that P-values are perfectly valid at all levels, and all that changes is how they should be interpreted, not whether they can be interpreted at all. To me, in experimental research, a P-value helps one argue for a causal relationship, whereas in confirmatory observational studies, a P-value helps one argue that some relationship exists, though not necessarily a causal one, and in exploratory observational research, a P-value simply suggests that there might be a relationship and so that potential relationship should be explored further in future research.
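For context on why P-values from unregistered analyses draw suspicion in the first place, here is the standard back-of-the-envelope calculation. It assumes m independent tests, each run at significance level α, with every null hypothesis actually true; the symbols and the numbers are mine, chosen for illustration, and are not taken from the webinar.

```latex
% Probability of at least one false positive among m independent tests,
% each run at significance level \alpha, when every null hypothesis is true.
\[
  P(\text{at least one false positive}) = 1 - (1 - \alpha)^{m}
\]
% Worked example with \alpha = 0.05 and m = 20 tests:
\[
  1 - (1 - 0.05)^{20} = 1 - 0.95^{20} \approx 0.64
\]
```

Read this way, a P-value from an exploratory analysis is not meaningless. It simply has to be weighed against how many tests were tried, which fits with treating it as a pointer toward future research.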

In the case of my thesis, I did employ P-values via Fisher’s exact test of independence, but I didn’t pre-register my analyses. That’s not to say that all my analyses were exploratory, just that I have no proof that I wasn’t data fishing. Indeed, I included variables that didn’t make any sense to include at all4, but still somehow turned out to be statistically significant, such as whether there was a relationship between the person who coded each token of my linguistic variable, (lol), and how that variable was realized. The webinar initially made me panic a bit, asking myself if it was irresponsible to have included P-values in my analyses, but after further reflection, I think it was completely justified. Most of my analyses were confirmatory anyway, even though I don’t have proof of that, and those that were arguably exploratory were still more useful to report with P-values as long as an explanation for how to interpret those P-values was also included, which is perhaps the one place where I could’ve done better.
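For readers who haven’t met Fisher’s exact test, a sketch of the simplest case may help. It assumes a 2×2 table with cell counts a, b, c, d and total n; the tables in the thesis may well have been larger, and the counts in the worked example are hypothetical.

```latex
% Probability of observing one particular 2x2 table
%   [ a  b ]
%   [ c  d ]
% with all row and column totals held fixed, where n = a + b + c + d.
\[
  p = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}}
    = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}
\]
% Worked example with hypothetical counts a = 3, b = 1, c = 1, d = 3 (n = 8):
\[
  p = \frac{\binom{4}{3}\binom{4}{1}}{\binom{8}{4}}
    = \frac{4 \cdot 4}{70} \approx 0.229
\]
```

The P-value of the test is then the sum of these probabilities over the observed table and every other table with the same margins that is at least as unlikely, under the usual two-sided convention.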

Ultimately, while I can understand why there’s so much focus on data fishing as a negative thing, I think it’s important to not overshoot the mark. P-values can certainly be misused, but that misuse seems to come down to not providing enough information to allow the reader to properly interpret them, not to whether they were included when they shouldn’t have been.


1. I prefer the term data fishing, which can be more easily taken in both a negative and a positive way, whereas P-hacking sounds like it’s always negative to me. The Wikipedia article on data fishing gives a pretty clear explanation of what it is, for those who are unaware.
2. The webinar is really good, actually. I would suggest that anyone who’s new to open science watch the whole thing.
3. In this case, the speaker seems to be using the term “confirmatory research” as something different from “causal research”, otherwise their description doesn’t make any sense.
4. In fact, my thesis advisor didn’t see the point in me including these variables at all.

The importance of anonymizing groups under study.

It’s been a long time since I’ve written a post here, but I promise, there’s a good reason: I was finishing up my master’s thesis. However, now that it’s submitted, I can talk a bit about what I did.1

Because I made use of social network analysis to detect communities in the study, there was little motivation to class subjects by social variables like ethnic group, race, religion, etc. In fact, I wouldn’t have been able to do so if I wanted to, because I assembled the corpus from tweets sent by some 200k people. Ultimately, the only variable that I can call a social variable that I used was the number for the community to which the subject belonged.

The advantage of this situation is that I completely avoided imposing stereotypes on the subjects or minimizing the differences between their identities by avoiding classifying them with people from elsewhere. A typical example of the problem in sociolinguistics is the variable of race. Some celebrated studies, like Labov’s (1966) and Wolfram’s (1969), classified their subjects according to their races, so that one ends up identifying some as African-American, for example. Even if these subjects don’t live together or interact, they inevitably end up being viewed as constituting a single group. From there, these groups’ diverse identities are minimized.

This problem has already been recognized in sociolinguistics, and several solutions have been proposed, mainly the implementation of the concept of communities of practice and more reliance on self-identification. For example, in her 1999 study, Bucholtz studied a group whose members she identified according to an activity: being a member of a club. Unfortunately, she applied a label to the members of this club; she called them “nerds”. This name links them to nerds from elsewhere, regardless of the differences between this group and other groups of nerds. She wasn’t able to avoid minimizing the identity of the group that she studied by the simple implementation of the concept of communities of practice. Likewise, Eckert (2000) relied on self-identification of her subjects as either “jock” or “burnout”, but one ends up with the same problem: even if the subjects self-identify, they can choose labels that link them to distant groups. Jocks surely exist elsewhere, but these other jocks can be exceptionally different from the jocks in Eckert’s study. So, one cannot avoid minimizing identities by the simple reliance on self-identification, either.

In my thesis, I identified communities simply with ID numbers, so I never classified the subjects with other groups to which they didn’t belong. The fact that I used social network analysis to automatically detect these communities allowed me to more easily avoid applying labels to the subjects that could minimize their identities, but this is possible in any study, even if the researcher employs classic social variables. In the same way that one anonymizes the identities of individuals, one can anonymize the identities of the groups under study. Why is it necessary to know that the races in a study are “black” and “white” or that the religions are “Jewish” and “Catholic”? If a researcher is interested in the way that their subjects navigate stereotypes that are relevant to their lives, that’s one thing, but most variationist studies don’t take up this question, so most studies can do more to protect marginalized people.


1. For those who don’t know the topic of my thesis, I analyzed the use of the linguistic variable (lol), made up of lol, mdr, etc., on Twitter.


Bucholtz, M. (1999). “Why Be Normal?”: Language and Identity Practices in a Community of Nerd Girls. Language in Society, 28(2), 203–223. https://doi.org/10.1017/s0047404599002043

Eckert, P. (2000). Linguistic Variation as Social Practice: The Linguistic Construction of Identity in Belten High. Malden, MA: Blackwell Publishers, Inc.

Labov, W. (2006). The Social Stratification of English in New York City (2nd ed.). Cambridge, England: Cambridge University Press. (Original work published 1966)

Wolfram, W. (1969). A sociolinguistic description of Detroit negro speech. Washington, D.C: Center for Applied Linguistics.

Pluralistic globalism and endangered languages.

I finally finished watching this over breakfast this morning. Something interesting from a linguistic perspective is that they don’t seem to use any English words in their Cherokee despite heavy contact, perhaps because they go to lengths to create new words for new things (see 35:00). This is not the strategy taken elsewhere, such as in Louisiana and the Maritimes (although Quebec tries to do this at least officially).

First Language, The Race to Save Cherokee by Neal Hutcheson on Vimeo.

Also, I think the quote at the end is particularly fitting given the current social and political climate throughout the West. He positions the idea of a strong local culture within a broader context that doesn’t necessarily need to reject larger over-arching cultures or even global interconnectedness:

“If we consider what it actually means to be a pluralistic society, then that means we’re gonna have to make space for people who speak different languages, who think different ways, who have different cultures, inside of a national culture or a global culture, and so all the movement has been in the opposite direction towards globalization, towards homogenization, you know? What does it mean to change the process and open up space for a plurality of different small cultures working together? How can we truly accept and respect those people and allow them some measure of autonomy with their educational system and the language that they speak?” –Hartwell Francis of Western Carolina University, my translation

An interesting cup of coffee.

I’m transcribing some broadcasts from Louisiana in French for a class on language change. For the recent broadcasts, I chose the show La Tasse de café on KVPI, and for the old broadcasts, the series En français, which was broadcast by Louisiana Public Broadcasting, a public TV station, in the 80s and 90s. I’m analyzing the variation between third person plural subject pronouns, meaning ils, ils -ont, ça, eux and eux-autres, but something that I immediately noticed in relation to the speech of Ms. Ledet, who was born in 1919, is that she employs many constructions that make her speech sound like that of the French in formal contexts. You don’t hear these constructions in the speech of Mr. Soileau and Mr. Manuel on KVPI (the former being born in 1941, the latter, I don’t know):

Ms. Ledet on En français

It’s not clear if this stems from a difference in region, in age, in interlocutor (the interviewer on En français seems rather France French), in interaction with francophones from elsewhere, or something else, but it’s interesting nonetheless. The corpus I’m constructing is small, because it’s just for a term paper, but I intend to extend it and possibly perform other analyses.


© 2025 Josh McNeill
