I’m a bit of a stickler about keeping English and Spanish separate. It’s not because of some purist streak, but because many people don’t speak both languages. It is only polite to stick to the one they have fully mastered.

In tech, this is virtually impossible. English is inevitable, even if we don’t consider the cringeworthy use of English-language terminology by Latin American startup founders and investors — usually to prove how clever they are, even as they misuse the words they’ve chosen. A particular pet peeve of mine is how certain sectors of Mexican society have opted to abbreviate “random” as “randy,” clearly not understanding that the word already has a very different definition in English. 

Some terms like “fake news” are used widely by government officials and the media, even though there are perfectly good equivalents in Spanish. Other words feel cringier when used in their “official” translations — influenciador makes me squirm, especially given that most Spanish speakers under the age of 60 understand the word influencer … or is it ínfluencer with an accent?

We’ve hit upon our first big conundrum: Should Spanish speakers pronounce a word phonetically, based on its spelling, or should they follow an English speaker’s pronunciation? Ask anyone to say the word “Wi-Fi” in Spanish and you’ll quickly realize that the jury is still out. These distinctions are often dictated by where any given Spanish speaker is from. Spanish Spanish (aka European Spanish) is particularly good at going all in on local pronunciation. My personal favorite is the way Airbnb comes out: “aheer benebeh” (as read in English).

The problem is that tech’s overreliance on English can actually do some damage. At a conference in Mexico City last week organized by Tec-Check and Poder del Consumidor, both consumer watchdog nonprofits, we were told how English is used to skirt Mexican law. Mexico requires any advertisement on social media to be clearly labeled as such. Many influencers fulfill the letter of the law by slapping the hashtag #ad onto a commercial, a label that no Spanish-speaking child or adult could be expected to understand.

Dropping English words into Spanish has another knock-on effect: It confuses AI language processors and muddies their training datasets. Spanish already trails far behind English when it comes to training AI, and experts tell me the slower pace of development might be due to the Spanish datasets being “contaminated” by English words and phrases. This might help explain the pretty terrible glitches in TikTok’s Latin American Spanish text-to-voice AI, where simple words like leer (to read) are pronounced as if they were being read in English.

Spanish is already too underrepresented on the internet to afford another setback as AI becomes a larger part of our digital lives. For instance, searches in Spanish for medical information online return poorer resources than the same searches conducted in English. Add to this the problem of linguistically “corrupted” AI assistants reading out medical advice inaccurately, and whole populations are suddenly placed in jeopardy.

That is not to say Spanglish must therefore be scrubbed from our datasets altogether to avoid corrupting strictly segregated versions of Spanish and English. On the contrary, Spanglish and the deep interconnections between Latin America and the United States need to be further understood and considered as we move ahead in the development of new technologies.