Soniox's mission is to accelerate the adoption of speech-based applications and spark innovation in human-machine voice interaction. We have developed the most accurate speech recognition system and made it freely available for anyone to use.
Developed a novel neural network model for speech activity detection that recognizes human speech segments in the audio stream of Facebook videos. The model is fast and accurate. Integrated the model into production and achieved a 40%+ reduction in transcribed audio data on Facebook videos without affecting the Word Error Rate.
We propose a set of concrete desiderata for general AI, together with a platform to test machines on how well they satisfy such desiderata, while keeping all further complexities to a minimum.
https://arxiv.org/abs/1701.08954
Co-organized the Machine Intelligence workshop at NIPS 2016. The workshop aimed to stimulate theoretical and practical advances in the development of machines endowed with human-like general-purpose intelligence, focusing in particular on benchmarks to train and evaluate progress in machine intelligence.
Released an environment for communication-based AI, a platform for training and evaluating AI systems on communication-based tasks.
Designed and developed DeepText models for intent classification and slot extraction for Facebook Messenger. Applied DeepText models to recognize ride intents for Uber, Lyft, and taxi services in Messenger, and to extract the object and price from for-sale Facebook posts.
M.S. Thesis: Concept Aware Co-occurrence and its Applications
Studied the problem of learning concept-level representations from large amounts of unstructured text data. Developed a structured prediction model for short text understanding: segmentation and disambiguation of phrases into concepts with limited syntax and context information.
Teaching Assistant at the University of Utah for two classes: Software Practice and Database Systems. Organized and taught lab projects, held office hours, graded projects, assignments and exams.
Internship at Google Research with Haixun Wang. Developed models for search query understanding: learning segmentation and disambiguation of phrases in queries. The models improved the understanding of intents and products in Google Shopping search queries.
Undergraduate Thesis: Network structural properties and their application to missing property prediction.
Studied the problem of predicting missing relation types for objects in large-scale knowledge bases: DBpedia and Freebase (Google Knowledge Graph).
Designed and developed a library for generic implementation of data structures that support complex queries on data (e.g. a combination of a balanced tree with a hash table and a doubly linked list). The library achieves high performance and low memory usage, and can be easily embedded into other projects.
Research internship at Stanford with Prof. Jure Leskovec. Worked on the discovery of network motifs (statistically significant sub-graphs), and used the motifs with an SVM to predict missing links in information networks.
Research internship at Stanford with Prof. Jure Leskovec. Investigated and developed methods for community detection in large social networks based on the Clique Percolation Method.
First-year undergraduate research project: A New Algorithm for Finding Frequent Items in Streams of Data. We present a new algorithm for finding frequent items in a stream of data that requires only a small fraction of the resources relative to the total quantity of data.