The Open Virtual Assistant Lab seminar is a biweekly event, held every other Monday, where students and researchers present their work on virtual assistants and linguistic user interfaces.

We're running this seminar for the first time in the Summer Quarter of 2019. The seminar is open to the Stanford community, and everybody is welcome to attend. If you would like to give a talk, please contact .

Mailing list: oval-seminar@lists.stanford.edu


6/17: Genie and Friends: Building Linguistic User Interfaces Easily and Cheaply

Time:

Location: Gates 104

Abstract:
Linguistic user interfaces (LUIs) will soon become ubiquitous as the new interface to digital services, the web, and IoT devices. At the same time, building a new LUI requires significant resources, both in terms of annotated data and machine learning expertise.
In this talk, I will present tools that attack this problem and help domain experts bootstrap new LUIs in their own domains. I will present Genie, a tool for building new compositional language capabilities using a direct translation from natural language to executable code. I will also present work in progress on multi-turn, conversational user interfaces. This new work is also based on a translation from language to code, and it is showing promising results.
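
For readers unfamiliar with the idea of translating language directly into executable code, here is a deliberately toy sketch. The command patterns and the Program structure below are invented purely for illustration; they are not Genie's actual target language, grammar, or API.

    import re
    from dataclasses import dataclass
    from typing import Optional

    # Hypothetical target representation: a tiny trigger-action program.
    # (Illustrative only; Genie's real target is its own formal language.)
    @dataclass
    class Program:
        trigger: str   # event that starts the program
        action: str    # operation to perform
        argument: str  # free-text parameter

    # A few invented utterance templates that map language to code skeletons.
    PATTERNS = [
        (re.compile(r"when it rains,? (?:please )?(.*)", re.I),
         lambda m: Program(trigger="weather.rain", action="notify", argument=m.group(1))),
        (re.compile(r"tweet (.*)", re.I),
         lambda m: Program(trigger="now", action="twitter.post", argument=m.group(1))),
    ]

    def parse(utterance: str) -> Optional[Program]:
        """Translate an utterance directly into an executable program, if a pattern matches."""
        for pattern, build in PATTERNS:
            match = pattern.match(utterance.strip())
            if match:
                return build(match)
        return None

    if __name__ == "__main__":
        print(parse("When it rains, remind me to bring an umbrella"))
        print(parse("tweet hello world"))

A real system replaces the hand-written patterns with a learned semantic parser trained on synthesized and paraphrased data, which is the part of the problem the talk addresses.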

Speaker: Giovanni Campagna

7/1: Open-Vocabulary Reinforcement Learning through Human Interaction

Time:

Location: Gates 104

Abstract:
Machine learning (ML) systems have begun to play a larger role in real-world deployments, increasing their interactions with people. In many of these domains, for example social media, the most available and expressive channel for interaction is text conversation. However, given substantial limits in what these ML-based conversational systems can actually support, heavy requirements have been placed on people to learn stilted forms of conversational interaction with them. We believe that the burden of learning to interact should lie on the agent, not the users. Language-based reinforcement learning models, which can learn through human interactions, are currently intractable to train because the action space is exponentially large in the vocabulary size and rewards from human feedback are sparse. We introduce a latent discretization of the action space and a variational autoencoder-based training method that together retain language expressivity while reducing the action space to learnable levels. We introduce ELIA, a conversational agent that improves its knowledge of the visual world by learning to ask people questions about the photos they upload to social media. Through a reinforcement learning framework, ELIA continuously learns from social media responses (or the lack thereof) to ask more engaging questions in a completely open-vocabulary setting. We deployed ELIA on Instagram for 8 months, during which it received 130,000 responses, learned to ask relevant questions and extract structured responses, and used those responses to train a state-of-the-art social media visual question answering model.
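
As a rough illustration of the core idea in the abstract (compressing an open-vocabulary action space into a small discrete latent space with a VAE-style objective), here is a toy sketch. It is not ELIA's published model; the bag-of-words encoder/decoder, the latent size, and the Gumbel-softmax relaxation are all assumptions chosen to keep the example short.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    VOCAB = 10000      # open vocabulary size (assumed)
    N_CODES = 64       # size of the discrete latent action space (assumed)

    class DiscreteActionVAE(nn.Module):
        """Toy VAE that compresses a bag-of-words utterance into one of N_CODES
        discrete latent actions and reconstructs the word distribution from it.
        An RL policy can then choose among N_CODES options instead of searching
        the full open-vocabulary utterance space."""
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(VOCAB, 256), nn.ReLU(),
                                         nn.Linear(256, N_CODES))   # logits over codes
            self.decoder = nn.Sequential(nn.Linear(N_CODES, 256), nn.ReLU(),
                                         nn.Linear(256, VOCAB))     # logits over words

        def forward(self, bow, tau=1.0):
            logits = self.encoder(bow)
            # Differentiable sample of a one-hot latent code (Gumbel-softmax relaxation).
            code = F.gumbel_softmax(logits, tau=tau, hard=True)
            recon = self.decoder(code)
            # Reconstruction term plus a KL toward the uniform prior over codes.
            recon_loss = F.binary_cross_entropy_with_logits(recon, bow)
            q = F.softmax(logits, dim=-1)
            log_n = torch.log(torch.tensor(float(N_CODES)))
            kl = (q * (q.clamp_min(1e-8).log() + log_n)).sum(-1).mean()
            return recon_loss + kl, code

    if __name__ == "__main__":
        model = DiscreteActionVAE()
        fake_batch = (torch.rand(8, VOCAB) < 0.001).float()   # random bag-of-words batch
        loss, codes = model(fake_batch)
        loss.backward()
        print(loss.item(), codes.argmax(-1))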

Speaker: Ranjay Krishna
Ranjay Krishna is a Ph.D. Candidate in the Artificial Intelligence Lab at Stanford University, where he is co-advised by Professors Fei-Fei Li and Michael Bernstein. His research interests lie at the intersection of computer vision, machine learning, natural language processing, and human-computer interaction. He leads a research group that is developing never-ending learning visual systems that can organically grow knowledge by interacting with people through conversations. He is also a teaching fellow at Stanford, where he designed and teaches a course on computer vision. He completed his B.S. in electrical and computer engineering at Cornell University and M.S. in computer science at Stanford University.

7/15: Practice Special

Time:

Location: Gates 104

Multi-modal Mobile Assistant

Abstract: Current interactions on mobile devices offer limited support for context switching and for referring back to interaction history. We would like to tackle this problem with a mobile assistant that uses multiple modalities for interaction. It allows users to combine voice commands with direct manipulation to easily query and execute commands, drawing on both the existing virtual assistant vocabulary and the user's collected interaction history.

Speaker: Jackie Yang

7/29: Effective Self-Attention Networks for Sequence Learning

Time:

Location: Gates 104

Abstract:
Self-attention networks (SANs) have shown promising empirical results in a variety of sequence learning tasks for natural language processing, such as machine translation (e.g., the Transformer) and language representations (e.g., BERT). In this talk, I will first give a brief introduction to sequence learning and self-attention mechanisms. Then, I will present two recent pieces of our work to improve SANs: a disagreement regularization that explicitly encourages a SAN to capture distinct features with different attention heads, and a method for incorporating context information into a SAN to improve its dependency modeling ability. We conduct experiments on machine translation with the Transformer framework, and the results demonstrate the effectiveness and universality of our proposed approaches.
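
For readers curious what a disagreement regularization can look like, below is a minimal sketch of one variant: penalizing the cosine similarity between the outputs of different attention heads, so that minimizing the loss drives the heads toward distinct features. The tensor shapes, the regularization weight, and the stand-in task loss are assumptions; see the speaker's papers for the exact formulation.

    import torch
    import torch.nn.functional as F

    def output_disagreement(head_outputs):
        """Mean pairwise cosine similarity between the outputs of different
        attention heads.  head_outputs: (batch, heads, seq_len, head_dim).
        Adding this term to the training loss (so it gets minimized) pushes
        the heads to capture distinct features, i.e. to "disagree"."""
        h = F.normalize(head_outputs, dim=-1)                    # unit-length head vectors
        sim = torch.einsum("bhld,bgld->bhgl", h, h)              # cosine sim for every head pair
        n_heads = h.size(1)
        off_diag = sim.sum(dim=(1, 2)) - sim.diagonal(dim1=1, dim2=2).sum(dim=-1)
        return off_diag.mean() / (n_heads * (n_heads - 1))       # average over head pairs

    if __name__ == "__main__":
        # Assumed shapes: 2 sentences, 8 heads, 5 tokens, 64-dim heads.
        heads = torch.randn(2, 8, 5, 64, requires_grad=True)
        task_loss = heads.pow(2).mean()                          # stand-in for the translation loss
        loss = task_loss + 1.0 * output_disagreement(heads)      # weight 1.0 is an assumption
        loss.backward()
        print(loss.item())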

Speaker: Jian Li
Jian Li is a fourth-year Ph.D. student at The Chinese University of Hong Kong. Currently, he is a visiting student in Prof. Monica Lam's group, working on natural language programming. His research interests include deep learning applied to natural language processing as well as programming language processing.

8/12: QuizBot: A Dialogue-based Adaptive Learning System for Factual Knowledge

Time:

Location: Gates 104

Abstract:
Advances in conversational AI have the potential to enable more engaging and effective ways to teach factual knowledge. To investigate this hypothesis, we created QuizBot, a dialogue-based agent that helps students learn factual knowledge in science, safety, and English vocabulary. We evaluated QuizBot with 76 students through two within-subject studies against a flashcard app, the traditional medium for learning factual knowledge. Though both systems used the same algorithm for sequencing materials, QuizBot led to students recognizing (and recalling) over 20% more correct answers than when they used the flashcard app. Practicing with a conversational agent is more time consuming; but in a second study, of their own volition, students spent 2.6x more time learning with QuizBot than with flashcards and reported a strong preference for it for casual learning. The results of this second study showed that QuizBot yielded improved learning gains over flashcards on recall. These results suggest that educational chatbot systems may have beneficial use, particularly for learning outside of traditional settings.

Speaker: Sherry Ruan

Sherry Ruan is a CS PhD student at Stanford University advised by Professor James Landay. Her research interests lie in intelligent tutoring systems, human-computer interaction, and human-centered AI.

8/26: TBA

Time:

Location: Gates 104

Abstract: TBA

Speaker: Abigail See

9/9: TBA

Time:

Location: Gates 104

Abstract: TBA

Speaker: TBA