Language, Digital Text & AI

Modern text and AI frameworks.

Bryan Rhoads
Oct 13, 2020
“breathe” Hawthorne Blvd Portland, Ore

I was not a solid “A” student in high school English class. Yes, it’s true. Nevertheless, I’ve been a fan and a natural student of my native tongue since about that time, probably stemming from my father’s interest and also the 1980s PBS series “The Story of English” (it’s available on YouTube too, highly recommended), not to mention Bill Bryson’s pleasing Mother Tongue and a few others. There was also the 3-week trip to England and Ireland at 17 when my Dad was conducting research for an upcoming book.

TL;DR

Words and their meanings shift over time as new words are adopted

Lexical diversity signals what is significant to persons, groups and societies

Never-ending textual data is created daily

AI methodologies offer frameworks for interpreting unstructured data

the author c 1989 — not taking English class seriously

Language and word choice for our purposes here… can be a fascinating, intriguing, striking yet awkward experience both to learn and to spell. Spellings, pronunciations and homonyms make English a difficult tongue to master. Harder still for the non-native speaker.

Yet today, English is arguably the global lingua franca of culture, business, the Internet, etc. It offers up a diverse lexicon, from synonyms, antonyms, slang, and lingo to historical pidgins and even secret guild or trade languages from the European Middle Ages.

Touché or maybe e.g. (exempli gratia from Latin) :

  • Fascinating or fascinate — from Latin
  • Intriguing or intrigue — from French (late 17th century, not 1066)
  • Striking or strike — from Old English, of Germanic origin
  • Awkward — from Old Norse or Old Scandinavian

English is not so much a language as it is a history of peoples, migrations, relationships, wars, religion, science, culture and trade.

A diverse lingua franca

A growing body of research examines how word choice can reflect the thoughts, attitudes, and culture of the people who use and consume it. We also know that words, phrases and their meanings shift and change over time. Shifting word meanings, plus the adoption of foreign or new words, create rich vocabularies and lexicons.

This lexical diversity and richness provide signals into what’s significant to a person, group or even the whole of society. Whether we are writing text messages or social posts or reading them, the words offer little windows of insight into the values of the upstream author and the mindsets of those downstream who consume them.
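To make “lexical diversity” a little more concrete, here is a minimal sketch of one simple and widely used proxy, the type-token ratio (unique words divided by total words). It is only an illustration: the function names and sample sentences are made up for this example, and it is not necessarily the measure used in the research mentioned here.

```python
import re

def tokenize(text):
    """Lowercase the text and pull out word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(text):
    """Unique words divided by total words; returns 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# Hypothetical sample reviews, invented for illustration only.
plain = "good food good service good price"
varied = "fascinating food intriguing service striking price"

print(type_token_ratio(plain))   # 4 unique words out of 6
print(type_token_ratio(varied))  # 6 unique words out of 6
```

The ratio is sensitive to text length, so it is best used to compare texts of similar size rather than as an absolute score.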

Ultimately those words, more often than not, become text. Text, including online reviews, social media posts, or a business’s own marketing efforts, provides data that can shed light on everything from end consumer needs to B2B sales insights and Account Based Marketing strategies.

A never-ending world of text

It’s estimated that 80 to 95% of all business communication involves text. After all, most efforts can be distilled down to a textual core. Videos have scripts, songs have lyrics, and podcasts and speeches can be transcribed.

Brands communicate with customers, and customers communicate back to brands and among each other through social media or online reviews. Moreover, businesses communicate with investors, and society communicates ideas and values (through traditional TV, film, music, etc.), all of which generates huge amounts of text.

Of special note here is the distinction between text producer and text receiver. Text reflects the producer’s thoughts and attitudes upstream, and it then impacts the receiver downstream.

Member of a tribe?

Exclusive language, or jargon, quickly identifies one as either an insider or an outsider. It’s also highly efficient, economical and even crucial in that it can capture distinctions not made by ordinary language. Ordinary language, then, is constantly stealing words and phrases from jargon.

Chaucer’s The Miller’s Tale c 1390 — Middle English

Every profession, trade and organization has its own bespoke jargon, where specialized terms and shorthand shared among groups from the very niche to the very large provide compressed cognition. Most jargon consists of unfamiliar phrases, abstract words, non-existent words, acronyms and abbreviations, plus the occasional euphemism.

Jargon first entered the English lexicon about 600 years ago when we borrowed it from French to mean “the twittering of a bird” — indicating the babble or the “blah blah blah” as we might say today.

We wouldn’t want to get rid of jargon. In fact, just the contrary, as it enriches our personal lexicon. Read How accessible is your personal lexicon? via Leslie Bradshaw. It’s jargon, including its context and use, that I find so enlightening.

Artificial Intelligence (AI) and Human Intelligence (HI)

Natural language processing and text analytics are both nascent techniques that offer tremendous promise towards more advanced tools for learning, understanding and teaching.

Almost all of the recent advances in AI and its capability to distill unstructured textual data were developed in fields outside of marketing. Yet modern marketers are well positioned at the intersection of audiences and brands to leverage and advance tools that make better sense of textual information that addresses key business challenges.

The goal is AI tools that aid our Human Intelligence: not AI that replaces the marketing and sales professional, but a better brush combining art and science.

So much work to be done

Much bigger societal woes, from the proliferation of misinformation, privacy and hate speech to authenticity and identification, are just a few of the areas where analyzing digitized text holds promise.

Timbers Army c 2018

Plus, so far we’ve only talked about using AI for “understanding”; we haven’t touched on methodologies for “prediction”. Language, and more specifically word choice, reflects information about who the users are, providing additional insight into what they might do in the future. This will have to be a longer discussion for another day, but using AI for prediction may not be a near-term thing: text alone often lacks the deeper relationships that language conveys.

Text analysis and Natural Language Processing need the addition of Machine Learning for more accurate predictions. Text alone is often missing key relationships between people and objects, or attitudes towards both. I’ve heard this referred to as the “Bag of Words” problem, i.e. those words and phrases need additional relationships that are not readily available in text by itself. This metadata will be crucial for teaching the machine.
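A tiny sketch may help show what gets lost. A bag-of-words representation simply counts words and ignores their order, so two sentences with opposite meanings can look identical to the machine. The function name and the sample sentences below are invented for illustration and are not taken from our prototype.

```python
from collections import Counter

def bag_of_words(text):
    """Count each lowercase word, discarding word order entirely."""
    return Counter(text.lower().split())

# Two hypothetical sentences with the same words but opposite relationships.
a = bag_of_words("the customer praised the brand")
b = bag_of_words("the brand praised the customer")

print(a)       # word counts for the first sentence
print(a == b)  # True: who praised whom is no longer recoverable
```

Both sentences collapse to the same counts, which is why extra information about who did what to whom has to come from somewhere other than the word counts themselves.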

We are adding Machine Learning now to our prototype.

What else does one do over Quarantine?

We started building a working prototype that performs many of the processes I’ve just described. I put together a novel approach and methodology to leverage these frameworks, got feedback from some old professors and colleagues, and started putting together an impressive team.

In my next post, I’ll be talking about the prototype and the methodology we have put in place to start to capitalize on textual data. It may or may not involve my cousin’s data center in Iowa.

What more? Go download and read Uniting the Tribes, submitted to the American Marketing Association (AMA Aug. 2019) by professors Jonah Berger, Ashlee Humphreys et al (et alia, more Latin) from Wharton, Northwestern, etc. (et cetera from Latin, but you knew that).

Next Up -> My Cousin’s Data Center in Iowa.

@bryanrhoads

Bryan Rhoads

// Transformative Optics / former @Micron @Intel @Creatorsproject @mitsloan / prof of digital business / co-creator of my children //