This week: Julia Hirschberg and Trace Foundation lectures

This week is just packed with linguistics goodness: Julia Hirschberg will be speaking this Friday, 3/26, and there is a series of lectures at the Trace Foundation downtown, starting on Friday and continuing on Saturday.

First, Professor Hirschberg’s lecture:

Friday, 3/26
Hamilton 709

Knowing When to Speak: Turn Management in Spoken Dialogue Systems

Julia Hirschberg
Department of Computer Science
Columbia University

Listeners have many options in dialogue: They may interrupt the current speaker, take the turn after the speaker has finished, remain silent and wait for the speaker to continue, or backchannel, to indicate that they are still listening, while not taking the turn. Previous studies have proposed a number of possible cues that may signal to listeners that a speaker is ready to relinquish the turn or, conversely, that a speaker intends to continue to hold the floor. I will describe results of empirical studies testing some of these proposals and investigating other correlates of turn-taking behaviors, in the context of a larger study of human-human turn-taking behavior in the Columbia Games Corpus. Our goal is to discover what types of human turn-taking behavior can most usefully be modeled in Spoken Dialogue Systems, both from the perspective of recognizing the import of users’ behavior and of generating appropriate system behavior. This is joint work with Agustín Gravano (University of Buenos Aires). We also thank our collaborators, Stefan Benus, Gregory Ward, Elisa Sneed, Hector Chavez, and Michael Mulley for their help in collecting and annotating the CGC and for useful discussions.

The Trace Foundation lectures will also start that Friday (in the evening – you can make it to both!) and continue all day Saturday:

Friday, 3/26 and Saturday 3/27
2 Perry Street, Suite 2B, New York, NY 10014

Minority Language in Today’s Global Society: Perspectives on Language Standardization

Language standardization is often looked to by language communities as a means of language maintenance and of strengthening cultural integrity, yet it may also contribute to varying degrees of linguistic discrimination and social conflict. In the case of the Tibetan language, which has a diversity of spoken dialects as well as a standard written language, new challenges and opportunities arising from urbanization, economic development, resettlement, and other factors create strong incentives to switch to other dominant languages in everyday use. Thus many Tibetans support the idea of promoting a standardized Tibetan but disagree as to what should be the basis for the standard.

In this lecture event, we will bring together scholars who have worked extensively on language standardization issues for Kurdish, Hungarian, Tibeto-Burman languages, and the three major dialects of Tibetan to examine questions such as: What should be the role of a standard language? What are its pros and cons? What are the experiences of other language communities in implementing standardization? We hope to understand these topics for minority languages in general and for the Tibetan language in China in particular, and to identify practical steps that can be taken.

IMPORTANT: Please register for this event by downloading and completing the registration form, then either emailing it or printing it and faxing it to +1 212-367-7380.



Friday, 3/26
5:30 – 6:00 pm: Check-in & Registration
6:00 – 7:00 pm: Opening Keynote Lecture, Q&A
7:00 – 8:00 pm: Reception


Saturday, 3/27
9:30 am – 10:00 am: Check-in & Breakfast Reception
10:00 am – 12:00 pm: Morning Session
12:00 pm – 1:00 pm: Lunch Break
1:00 pm – 5:00 pm: Afternoon Session & Closing Keynote Lecture

For more information, including speaker biographies, visit the event page on the Trace Foundation website.


Dear linguists,

Here’s wishing you a delightful spring break, wherever you’re going (or staying)! For your reading pleasure, here’s an overview of upcoming events:

  • March 26-27: The Trace Foundation: “Minority Language in Today’s Global Society: Perspectives on Language Standardization.” The lectures will focus on Tibetan, Kurdish, and Hungarian.
  • March 26: Professor Julia Hirschberg, Computer Science at Columbia: “Knowing When to Speak: Turn Management in Spoken Dialogue Systems”
  • March 31: Professor Ann Senghas on Nicaraguan Sign Language: “Social Scaffolding for Language Genesis: Why Nicaraguan Sign Language Emerged When, Where and How it Did”
  • April 1: Professor Robert Remez on voice recognition: “I would know that voice anywhere! The role of phonetic sensitivity in the perceptual identification of talkers.”
  • April 30: “Workshops on Meaning: Language and Socio-cultural Processes”, co-sponsored by the Columbia Linguistics Society, presents Dr. William Labov.

More details to come…

Upcoming Events

Friday, 11/13/09
7:30 pm
Dinner at the Columbia Cottage. Join us for food and conversation – all are welcome! RSVP on Facebook.

Friday, 11/20/09
3 pm (location TBA)
A computational linguistics/natural language processing presentation by Nizar Habash of the Center for Computational Learning Systems:

Automatic Diacritization of Arabic Text

Arabic is written without certain orthographic symbols, called diacritics, which represent among other things short vowels. The restoration of diacritics to written Arabic is an important processing step for several computational linguistic applications, including training language models for automatic speech recognition, text-to-speech generation, and so on. We present here a new diacritization system for written Arabic based on a new combination of known techniques: a lexical resource for morphological analysis, a multi-classifier tagger and a lexeme language model. This new diacritization system outperforms the best previously published results by reducing the word error rate to 14.9% and reducing the diacritic error rate to 4.8%. The presentation includes a detailed error analysis classifying the type of errors resolved by each of the different modules used.

Friday, 12/11/09
Time and location TBA
Peter Connor of Barnard College will give a lecture on translation. More details to come.

New course!

In the Fall 2009 semester, the Computer Science department will be offering a new course on speech processing.

CS 6998: Topics in Speech Processing: Computational Approaches to Emotional Speech

This course introduces students to research on emotional speech. We will explore state-of-the-art work on recognizing and producing classic emotions automatically. Emotions such as anger, happiness, sadness, and uncertainty are important to recognize in online dialogue systems and to produce in computer games. We will also study the recognition of other types of speaker state, including deceptive and charismatic speech, and the uses of acoustic and prosodic information in medical domains, for the diagnosis of mental and physical disabilities. Classes will be lecture and discussion with an emphasis on group participation. There are no prerequisites for the course.

Computer Science courses

For the more technologically inclined, these classes might be a lot of fun:

*COMS W4705x Natural Language Processing*
Lect: 3. 3 pts.

Prerequisites: COMS W3133, or W3134, or W3139, or the instructor’s permission. Computational approaches to natural language generation and understanding. Recommended preparation: some previous or concurrent exposure to AI or Machine Learning. Topics include information extraction, summarization, machine translation, dialogue systems, and emotional speech. Particular attention is given to robust techniques that can handle understanding and generation for the large amounts of text on the Web or in other large corpora. Programming exercises in several of these areas.

*COMS W4706y Spoken Language Processing*
Lect: 3. 3 pts.
Not offered in 2008-2009.
Prerequisites: COMS W3133, or W3134, or W3137, or W3139, or the instructor’s permission. Computational approaches to speech generation and understanding. Topics include speech recognition and understanding, speech analysis for computational linguistics research, and speech synthesis. Speech applications including dialogue systems, data mining, summarization, and translation. Exercises involve data analysis and building a small text-to-speech system.