The Open Virtual Assistant Lab seminar is a weekly event where students and researchers present their work in areas related to voice user interfaces, chatbots and virtual assistants. Topics include user interaction with natural language, chatbot-based applications, agent-to-agent distributed systems, question answering, natural language understanding and generation, and more.

The seminar is open to the Stanford community and members of the OVAL affiliate program. If you're interested in giving a talk, please contact .

Mailing list: oval-seminar@lists.stanford.edu

Archive: Summer 2019, Fall 2019


1/10: Organizational Lunch

Time:

Location: Gates 463A (4th floor, B wing)

Organizational lunch. Come enjoy food and sign up to give a talk during the quarter.

1/17: Domain-Specific Question Answering for Conversational Systems

Time:

Location: Gates 463A (4th floor, B wing)

Abstract:
Open-domain question answering (QA) is the task of answering natural language questions from a large collection of documents. A typical open-domain QA system starts with information retrieval to select a subset of documents from the corpus, which are then processed by a reading comprehension model to select answer spans. The majority of prior research on this topic focuses on answering questions from Wikipedia. However, searching for answers in a broad range of specialized domains, from IT infrastructure to the health sciences, remains a challenging problem, since many of these domains lack extensive labeled datasets for training. In addition, typical systems are designed to favor accuracy, with latency and throughput treated as secondary concerns.
In this talk, I will describe an open-domain QA system built around a new multi-stage pipeline, which employs a traditional information retriever, neural relevance feedback, a neural ranker, and a reading comprehension stage. This system substantially outperforms the previous state of the art for question answering on Wikipedia/SQuAD, and can be easily tuned to meet various timing requirements. I will also discuss how synthesized in-domain data enables effective domain adaptation for such systems.
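
The staging described above maps naturally onto a simple pipeline. The sketch below is purely illustrative: the retriever, feedback model, ranker, and reader objects and their methods are hypothetical stand-ins, not the speaker's implementation, but they show how the stages compose and where the top-k cut-offs that govern latency would sit.

```python
# Hypothetical multi-stage open-domain QA pipeline: retrieve -> relevance
# feedback -> rerank -> read. All components are duck-typed placeholders.

def answer_question(question, corpus, retriever, feedback_model, ranker, reader,
                    k_retrieve=100, k_read=10):
    # Stage 1: traditional (e.g. term-matching) retrieval over the full corpus.
    passages = retriever.search(question, corpus, top_k=k_retrieve)

    # Stage 2: neural relevance feedback refines the query using the top
    # retrieved passages, and retrieval is repeated with the refined query.
    refined_query = feedback_model.refine(question, passages)
    passages = retriever.search(refined_query, corpus, top_k=k_retrieve)

    # Stage 3: a neural ranker scores (question, passage) pairs; the k_read
    # cut-off is one of the knobs that trades accuracy for latency/throughput.
    passages = sorted(passages, key=lambda p: ranker.score(question, p),
                      reverse=True)[:k_read]

    # Stage 4: a reading comprehension model extracts candidate answer spans.
    spans = [reader.extract(question, p) for p in passages]
    return max(spans, key=lambda s: s.confidence)
```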

Speaker: Sina Jandaghi Semnani
Sina Semnani is a PhD candidate in the Electrical Engineering department at Stanford University, and holds a BS degree in Computer Science. He has previously worked on using machine learning tools in computer network design. He is interested in building systems that can extract knowledge from large amounts of unstructured data available in various domains. Additionally, his research interests include conversational systems and data-efficient deep learning.

1/24: Using Synthesized Data To Train Multi-Domain Dialogue State Trackers

Time:

Location: Gates 463A (4th floor, B wing)

Abstract:
Multi-domain dialogue state tracking is the task of tracking the domain of a conversation, the intent of the user, and the information provided so far in a task-oriented dialogue. The current state of the art uses Wizard-of-Oz techniques to acquire data in new domains, which is expensive and prone to human error. We instead propose a zero-shot training strategy in which real-world data from other domains is combined with synthesized data in the new domain to bootstrap a dialogue state tracker. Our technique uses an abstract model of task-oriented dialogues, which can be instantiated in different domains by providing a domain-specific lexicon, to synthesize a large set of dialogues together with their turn-by-turn annotations. In our experiments on the MultiWOZ benchmark dataset, we achieve between 63% and 93% of the accuracy obtained with real-world data, at a fraction of the cost. Our experiments also show that pretrained language models (BERT) complement the synthesized data. I will also present work in progress on applying the technique to the ThingTalk programming language, to produce full multi-domain conversational agents for task-oriented dialogues with minimal conversational design.
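
To make the synthesis idea concrete, here is a minimal toy sketch of how a domain-specific lexicon can instantiate annotated task-oriented turns. The lexicon, templates, and slot names below are made up for illustration; the actual system uses a much richer abstract dialogue model.

```python
import random

# Toy domain lexicons: slot names and possible values (illustrative only).
LEXICON = {
    "restaurant": {"food": ["italian", "thai"], "area": ["north", "centre"]},
    "hotel": {"stars": ["3", "4"], "area": ["east", "west"]},
}

def synthesize_dialogue(domain):
    """Instantiate an abstract 'inform one slot per turn' skeleton in `domain`."""
    state, turns = {}, []
    for slot, values in LEXICON[domain].items():
        value = random.choice(values)
        state[f"{domain}-{slot}"] = value
        turns.append({
            "user": f"i am looking for a {domain} with {slot} {value}",
            "system": f"sure, searching for {slot} = {value}.",
            # Turn-by-turn annotation: the belief state accumulated so far.
            "belief_state": dict(state),
        })
    return turns

# Synthetic dialogues like these are mixed with real data from other domains
# to bootstrap a state tracker for the new domain.
print(synthesize_dialogue("restaurant"))
```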

Speaker: Giovanni Campagna

1/31: Building Robust Natural Language Processing Systems

Time:

Location: Gates 463A (4th floor, B wing)

Abstract:
While modern NLP systems have achieved outstanding performance on static benchmarks, they often fail catastrophically when presented with inputs from different sources or inputs that have been adversarially perturbed. This lack of robustness exposes troubling gaps in current models' understanding capabilities and poses challenges for deploying NLP systems in high-stakes situations. In this talk, I will demonstrate that building robust NLP systems requires reexamining all aspects of the current model-building paradigm. First, I will show that adversarially constructed test data reveals vulnerabilities that are left unexposed by standard evaluation methods. Second, I will demonstrate that active learning, in which data is adaptively collected based on a model's current predictions, can significantly improve the ability of models to generalize robustly, compared to the use of static training datasets. Finally, I will show how to train NLP models to produce certificates of robustness: guarantees that, for a given example and a combinatorially large class of possible perturbations, no perturbation can cause a misclassification.
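
As a rough illustration of what a robustness certificate promises, the toy sketch below (hypothetical model and synonym set, not the speaker's code) checks robustness by brute-force enumeration of a word-substitution perturbation class; certified training instead bounds the model's behavior over this combinatorially large set without enumerating it.

```python
from itertools import product

# Illustrative synonym-substitution perturbation class.
SYNONYMS = {"movie": ["film", "picture"], "great": ["fantastic", "terrific"]}

def is_robust(model, tokens):
    """Return True if no synonym substitution changes the model's prediction."""
    original = model.predict(tokens)
    options = [[tok] + SYNONYMS.get(tok, []) for tok in tokens]
    # The perturbation set grows exponentially in the number of substitutable
    # tokens, which is why enumeration does not scale and certificates matter.
    for perturbed in product(*options):
        if model.predict(list(perturbed)) != original:
            return False  # found a perturbation that flips the label
    return True
```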

Speaker: Robin Jia
Robin Jia is a sixth-year PhD student advised by Percy Liang. His research interests lie broadly in building natural language processing systems that can generalize to unexpected test-time inputs. His work has received an Outstanding Paper Award at EMNLP 2017 and a Best Short Paper Award at ACL 2018.

2/7: Machine Learning for Accelerating Scientific Discovery

Time:

Location: Gates 463A (4th floor, B wing)

Abstract:
For a long time, the process of scientific discovery has relied primarily on human ingenuity and passive observations of the world. The rise in sensing capabilities and computational power over the last few decades has, however, opened the way to a new paradigm of data-driven scientific discovery. In this talk, I'll discuss some challenges and methods for integrating machine learning more closely with science and engineering. For instance, how can we incorporate domain knowledge within machine learning algorithms? How can we use machine learning to aid the simulation and design of complex natural phenomena? How can we use machine learning to plan experiments under budget constraints? For all these questions, we'll see efficient ways to resolve tensions between computation and statistics, and in the process, accelerate science and engineering in various domains.

Speaker: Aditya Grover
Aditya Grover is a 5th-year Ph.D. candidate in Computer Science at Stanford University advised by Stefano Ermon. His research focuses on advancing probabilistic machine learning algorithms for high-dimensional data, with applications to accelerating science and engineering. He also co-created and teaches CS 236: Deep Generative Models at Stanford. Previously, Aditya obtained his bachelor's degree in Computer Science and Engineering from IIT Delhi in 2015.

2/14: Answering Questions about Charts and Generating Visual Explanations

Time:

Location: Gates 463A (4th floor, B wing)

Abstract:
People often use charts to analyze data, answer questions, and explain their answers to others. In a formative study, we find that such human-generated questions and explanations commonly refer to visual features of charts. Based on this study, we developed an automatic chart question answering pipeline that generates visual explanations describing how the answer was obtained. Our pipeline first extracts the data and visual encodings from an input Vega-Lite chart. Then, given a natural language question about the chart, it transforms references to visual attributes into references to the data. It next applies a state-of-the-art machine learning algorithm to answer the transformed question. Finally, it uses a template-based approach to explain in natural language how the answer is determined from the chart's visual features. A user study finds that our pipeline-generated visual explanations significantly outperform human-generated explanations in transparency and are comparable to them in usefulness and trust.
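
The sketch below illustrates the shape of such a pipeline under simplifying assumptions: the Vega-Lite spec inlines its data, the reference rewriter is a toy string substitution, and the QA model is a placeholder argument. It is not the authors' implementation.

```python
# Toy chart question answering pipeline over a Vega-Lite spec (illustrative).

def rewrite_visual_references(question, encodings):
    # Toy rewrite: replace channel references ("y axis") with the data field
    # that channel encodes (e.g. "sales").
    for channel, field in encodings.items():
        question = question.replace(f"{channel} axis", field)
    return question

def chart_qa(vega_lite_spec, question, qa_model):
    # 1. Extract the data table and visual encodings from the spec
    #    (assumes the data is inlined under "values").
    data = vega_lite_spec["data"]["values"]
    encodings = {channel: enc["field"]
                 for channel, enc in vega_lite_spec["encoding"].items()}

    # 2. Turn references to visual attributes into references to the data.
    data_question = rewrite_visual_references(question, encodings)

    # 3. Answer the rewritten question with a table QA model (placeholder).
    answer = qa_model.answer(data_question, data)

    # 4. Fill a natural-language template that explains the answer in terms of
    #    the chart's visual features.
    explanation = (f"The answer {answer!r} was read off the marks encoding "
                   f"{encodings.get('y', 'the value field')}.")
    return answer, explanation
```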

Speaker: Dae Hyun Kim
Dae Hyun is a Computer Science PhD student at Stanford, working with Prof. Maneesh Agrawala. His research focuses on building natural language interfaces for data visualizations. He did his undergraduate degree in Computer Science at the California Institute of Technology.

2/21: Distributed Perception and Learning Between Robots and the Cloud

Time:

Location: Gates 463A (4th floor, B wing)

Abstract:
Tomorrow’s robots will seldom be limited by their on-board compute and memory constraints. Instead, they will be able to gracefully use cloud computing services to query extensive map databases, run compute- and power-intensive machine vision models, and even continually improve these models with collective field data. Indeed, the adoption of cloud-connected robots is increasingly important today, since current robots struggle to process growing volumes of rich sensory data and to scalably run compute- and power-hungry perception models. While the benefits of cloud robotics have long been envisioned, we still lack flexible methods to trade off these benefits against the end-to-end system costs of network delay, cloud storage, human annotation time, and cloud-computing time. To address this problem, I will introduce decision-theoretic algorithms that allow robots to significantly transcend their on-board perception capabilities with cloud computing, in a graceful, fault-tolerant manner. The utility of these algorithms will be demonstrated on months of field data and on experiments with state-of-the-art embedded deep learning hardware, which provide key insights for future work on distributed control.
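
Here is a minimal sketch of the kind of decision-theoretic trade-off involved, assuming the robot can estimate its on-board confidence, the cloud model's expected accuracy, and the relevant costs. The formula and numbers are illustrative, not the speaker's algorithm.

```python
# Offload a perception query to the cloud only when the expected cost of a
# local error outweighs the expected cost of cloud inaccuracy plus delay.

def should_offload(onboard_confidence, expected_cloud_accuracy,
                   network_delay_s, delay_cost_per_s, error_cost):
    # Expected cost of trusting the on-board model's prediction.
    cost_local = (1.0 - onboard_confidence) * error_cost
    # Expected cost of querying the cloud: usually more accurate, but we pay
    # for network delay (and, implicitly, cloud compute and storage).
    cost_cloud = ((1.0 - expected_cloud_accuracy) * error_cost
                  + network_delay_s * delay_cost_per_s)
    return cost_cloud < cost_local

# Example: a low-confidence on-board detection is worth sending to the cloud.
print(should_offload(onboard_confidence=0.55, expected_cloud_accuracy=0.95,
                     network_delay_s=0.2, delay_cost_per_s=1.0, error_cost=10.0))
```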

Speaker: Sandeep Chinchali
Sandeep Chinchali is a computer science PhD candidate at Stanford, advised by Sachin Katti and Marco Pavone. Previously, he was the first principal data scientist at Uhana, a Stanford startup working on data-driven optimization of cellular networks, since acquired by VMware. His research on networked control has led to proof-of-concept trials with major cellular network operators and was a finalist for Best Student Paper at Robotics: Science and Systems 2019. Prior to Stanford, he graduated from Caltech, where he worked on robotics at NASA’s Jet Propulsion Lab (JPL).

2/28: Controllable Text Generation

Time:

Location: Gates 463A (4th floor, B wing)

Abstract:
Large-scale language models show promising text generation capabilities, but we would like to increase control over particular aspects of the generated text. We recently released CTRL, a conditional Transformer language model for more controllable text generation. Control codes for CTRL were derived from structure that naturally co-occurs with raw text, preserving the advantages of unsupervised learning at scale. We will also discuss methods recently introduced by other groups, as well as several current research directions we are exploring for controllable text generation. Multiple full-sized, pretrained versions of CTRL are publicly available.
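
For readers who want to try control codes directly, here is a hedged sketch using the Hugging Face Transformers port of CTRL. The class names follow that library; the model id and the "Reviews" control-code prompt format are assumptions to double-check against the released model card.

```python
# Conditioning CTRL generation on a control code (illustrative usage sketch).
from transformers import CTRLTokenizer, CTRLLMHeadModel

tokenizer = CTRLTokenizer.from_pretrained("Salesforce/ctrl")  # assumed hub id
model = CTRLLMHeadModel.from_pretrained("Salesforce/ctrl")

# The control code ("Reviews" here) is simply prepended to the prompt; CTRL was
# trained so that this leading token steers the domain and style of the output.
prompt = "Reviews Rating: 5.0 This vacuum cleaner"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output = model.generate(input_ids, max_length=60, repetition_penalty=1.2)
print(tokenizer.decode(output[0]))
```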

Speaker: Bryan McCann
Bryan McCann is a Lead Research Scientist at Salesforce Research. He primarily works on Deep Learning and its applications to Natural Language Processing. Bryan's research started by introducing contextualized word vectors for transfer learning, then focused on multitask learning before branching out into several directions: commonsense reasoning, large-scale language models, and abstractive text summarization. He previously worked with teams that brought Einstein Intent and NER systems to production, currently leads research for Einstein Voice Assistant, and also consults for key conversational NLP pipelines related to new applications of Einstein Voice.

3/6: Learning Information Aggregation for Neural Machine Translation

Time:

Location: Gates 463A (4th floor, B wing)

Abstract:
Deep self-attention networks, i.e. the popular Transformer architecture, have advanced the state of the art in various natural language processing tasks. The strength of the Transformer lies in its ability to capture different linguistic properties of the input with different layers and different attention heads. However, current models only use the last layer and linearly combine all attention heads for subsequent processing, which may not be expressive enough to fully capture the rich information in all layers and attention heads. In this talk, I will discuss how to effectively aggregate the representations learned by the different components of the Transformer. Specifically, I will first introduce several strategies for fusing information across layers, including layer aggregation and multi-layer attention. Then I will introduce a unified method, based on low-rank bilinear pooling, to aggregate both layer representations and attention heads. Experiments on machine translation tasks demonstrate the effectiveness of the proposed strategies.
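
As a point of reference, the sketch below implements only the simplest form of layer fusion, a learned scalar mix over per-layer outputs; the strategies in the talk (multi-layer attention, low-rank bilinear pooling over layers and heads) generalize this idea.

```python
import torch
import torch.nn as nn

class LayerAggregation(nn.Module):
    """Fuse all encoder layers with learned, softmax-normalized weights."""
    def __init__(self, num_layers):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, layer_states):
        # layer_states: (num_layers, batch, seq_len, d_model)
        weights = torch.softmax(self.layer_logits, dim=0)
        return torch.einsum("l,lbsd->bsd", weights, layer_states)

# Usage: fuse the per-layer outputs of a 6-layer encoder instead of keeping
# only the top layer.
states = torch.randn(6, 2, 10, 512)
fused = LayerAggregation(num_layers=6)(states)
print(fused.shape)  # torch.Size([2, 10, 512])
```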

Speaker: Jian Li
Jian Li is a fifth-year Ph.D. student at The Chinese University of Hong Kong. He is currently a visiting student in Prof. Monica Lam's group, working on natural language programming. His research interests include deep learning applied to natural language processing as well as programming language processing.