DIANA CAI

Ph.D. Candidate, Princeton University
Email: dcai-at-cs-dot-princeton-edu

Research interests: I am broadly interested in statistical modeling, theory, and applications, with an emphasis on scalable, nonparametric learning.

Selected Work

View all papers

About Me

I am a graduate student at Princeton University studying machine learning and Bayesian statistics. Previously, I received an M.S. in statistics at the University of Chicago, and an A.B. in computer science and statistics from Harvard University, where I was a member of the Harvard Intelligent Probabilistic Systems (HIPS) Group. I was an organizer for the 2016 Women in Machine Learning Workshop in Barcelona, Spain.



Past travel & updates

Projects

Projects, by topic:

  1. probabilistic modeling,
  2. graphs and networks,
  3. other projects.



Probabilistic modeling and scalable inference


Probabilistic modeling is a powerful tool for understanding data in a wide variety of applications. My research focuses on developing and analyzing the properties of flexible probabilistic models for unsupervised learning, such as clustering, feature modeling, topic modeling, and network modeling.



Exchangeable trait allocations

(with Trevor Campbell and Tamara Broderick).

We study exchangeable trait allocations, a class of combinatorial models that generalizes partitions, feature allocations, and more. In this work, we characterize the class of exchangeable trait allocations, which we call the trait paintbox. We also characterize a subclass of models particularly amenable to MCMC and variational inference algorithms. We show how constrained trait allocations can be applied to graphs as a generalization of edge-exchangeable graphs and hypergraphs; additional details can be found here.
Submitted, 2016.
pdf arxiv

Preliminary version in the NIPS 2016 Workshop on Practical Bayesian Nonparametrics, 2016. pdf




Efficient variational approximations for
online Bayesian changepoint detection

(with Ryan P. Adams).

We develop a scalable method for online changepoint detection in conditionally conjugate latent variable models that detects global changes in the latent components. The method is an online variational inference algorithm that adaptively decreases the number of sufficient statistics, and we demonstrate inference on mixture models and topic modeling applications.
github



Graphs and networks


Many modern data sources are generated by complex interactions between entities, such as online social and communication networks, biological networks including gene and protein interaction networks, and databases. My research focuses on developing nonparametric methods and theory for random graphs and relational data.



Edge-exchangeable graphs: sparsity, power laws,
paintboxes, and probability functions

(with Trevor Campbell and Tamara Broderick).

We study edge exchangeability, in which the order of the edges does not affect the distribution of the graph. We show that, unlike many popular graph models, which are traditionally vertex exchangeable, edge-exchangeable models admit sparsity and power laws. We also characterize the class of edge-exchangeable graphs and a subclass that is particularly amenable to posterior inference.

Edge-exchangeable graphs and sparsity.
Advances in Neural Information Processing Systems (NIPS), 2016.
pdf arXiv spotlight video poster

    Preliminary versions appeared as:
  • Completely random measures for modeling power laws in sparse graphs.
    NIPS workshop on Networks in the Social and Information Sciences, 2015. pdf
  • Edge-exchangeable graphs and sparsity.
    NIPS workshop on Networks in the Social and Information Sciences, 2015. pdf
  • Edge-exchangeable graphs, sparsity, and power laws.
    NIPS Workshop on Bayesian Nonparametrics: The Next Generation, 2015. pdf

Paintboxes and probability functions for edge-exchangeable graphs.
NIPS Workshop on Adaptive and Scalable Nonparametric Methods in Machine Learning, 2016. pdf poster slides






Priors on exchangeable directed graphs

(with Nate Ackerman and Cameron Freer).

Exchangeable directed graphs are characterized by a sampling procedure given by the Aldous-Hoover theorem, determined by specifying a distribution on measurable objects known as digraphons. We present a new Bayesian nonparametric model for exchangeable directed random graphs.
Electronic Journal of Statistics (EJS), 2016.
pdf arXiv poster slides

Preliminary version in the NIPS Workshop on Bayesian Nonparametrics, 2015.
Contributed talk in the 10th Conference on Bayesian Nonparametrics, 2015.
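
As a rough illustration of the sampling procedure mentioned above, the sketch below draws a directed graph by giving each vertex a uniform latent variable and then including each ordered pair as an edge with a probability that depends on the two latent variables. This is a deliberately simplified special case, not the model from the paper: it treats the two directions between a pair of vertices as independent and ignores self-loops. The names `sample_directed_graph` and `w_example` are hypothetical.

```python
import random

def sample_directed_graph(n, w, seed=0):
    """Sample a directed graph on n vertices (simplified sketch).

    w(x, y) is an assumed function on [0, 1]^2 giving the probability of
    an edge i -> j when vertex i has latent value x and vertex j has y.
    Edge directions are sampled independently, which is a simplification.
    """
    rng = random.Random(seed)
    # One uniform latent variable per vertex.
    u = [rng.random() for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(n):
            # Skip self-loops; include i -> j with probability w(u_i, u_j).
            if i != j and rng.random() < w(u[i], u[j]):
                edges.add((i, j))
    return u, edges

def w_example(x, y):
    # Hypothetical asymmetric edge-probability function, for illustration.
    return 0.5 * x * (1.0 - y)

u, edges = sample_directed_graph(5, w_example)
```

Because `w_example` is asymmetric in its arguments, an edge from i to j and an edge from j to i generally have different probabilities.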




Iterative step-function estimation for graphons

(with Nate Ackerman and Cameron Freer).

We present a method for estimating graphons (symmetric, measurable functions from which we can sample exchangeable random graphs) by iteratively refining a partition. At each step, we compute the similarity of vertices based on their respective neighborhoods with respect to the previous partition's edge densities. A step-function estimator is then obtained by grouping vertices and taking the average edge density across each pair of classes in the partition.
pdf arXiv poster
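
The final averaging step described above can be sketched as follows. This is a minimal illustration that assumes a partition is already given; it does not implement the iterative refinement itself. The function name `step_function_estimate` and the toy inputs are hypothetical.

```python
from itertools import product

def step_function_estimate(adj, labels):
    """Average edge density for each pair of classes in a vertex partition.

    adj    -- symmetric 0/1 adjacency matrix (list of lists), no self-loops
    labels -- labels[i] is the class index of vertex i (assumed given here;
              in the method above it would come from the refinement steps)
    """
    k = max(labels) + 1
    classes = [[v for v, c in enumerate(labels) if c == a] for a in range(k)]
    density = [[0.0] * k for _ in range(k)]
    for a, b in product(range(k), repeat=2):
        # Ordered pairs of distinct vertices, one from each class.
        pairs = [(i, j) for i in classes[a] for j in classes[b] if i != j]
        if pairs:
            density[a][b] = sum(adj[i][j] for i, j in pairs) / len(pairs)
    return density

# Toy graph on 4 vertices with edges 0-1, 0-2, and 2-3.
adj = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
labels = [0, 0, 1, 1]  # classes {0, 1} and {2, 3}
estimate = step_function_estimate(adj, labels)
```

Each entry `estimate[a][b]` is the value of the step function on the block formed by classes `a` and `b`.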





Other projects



The Ratio Project: Analyzing Online Recipes

We analyzed online recipes using computational methods and exploratory visualizations.
"A food pyramid made of cookies." The Boston Globe, Dec 2011. link
(with Elaine Angelino and Michael Brenner)
Cocktails Visualization, Jun 2013.
(with Elaine Angelino, Gabrielle Ehrlich, Brent Heeringa, Michael Mitzenmacher, Naveen Sinha).


Papers


Preprints and working papers

  1. Exchangeable trait allocations.
    Trevor Campbell, Diana Cai, Tamara Broderick.
    arXiv:1609.09147 [math.ST].
  2. An iterative step-function estimator for graphons.
    Diana Cai, Nate Ackerman, Cameron Freer.
    arXiv:1412.2129 [math.ST, stat.ML, stat.CO].

Journal and conference papers

  1. Edge-exchangeable graphs and sparsity. arXiv:1612.05519.
    Diana Cai, Trevor Campbell, Tamara Broderick.
    Advances in Neural Information Processing Systems (NIPS), 2016.
  2. Priors on exchangeable directed graphs. arXiv:1510.08440.
    Diana Cai, Nate Ackerman, Cameron Freer.
    Electronic Journal of Statistics (EJS), 2016.

Workshop papers

  1. Paintboxes and probability functions for edge-exchangeable graphs.
    Diana Cai, Trevor Campbell, Tamara Broderick.
    NIPS Workshop on Adaptive and Scalable Nonparametric Methods in Machine Learning, 2016. [[pdf]]
  2. A paintbox representation for exchangeable trait allocations.
    Trevor Campbell, Diana Cai, Tamara Broderick.
    NIPS Workshop on Practical Bayesian Nonparametrics, 2016. [[pdf]]
  3. Priors on exchangeable directed graphs.
    Diana Cai, Nate Ackerman, Cameron Freer.
    NIPS Workshop on Bayesian Nonparametrics: The Next Generation, 2015. [[pdf]]
    ISBA@NIPS Special Travel Award for Contributed Paper, 2015.
  4. Completely random measures for modeling power laws in sparse graphs.
    Diana Cai, Tamara Broderick.
    NIPS workshop on Networks in the Social and Information Sciences, 2015. [[pdf]]
    arXiv:1603.06915 [stat.ML, math.ST, stat.ME].
  5. Edge-exchangeable graphs, sparsity, and power laws.
    Tamara Broderick, Diana Cai.
    NIPS Workshop on Bayesian Nonparametrics: The Next Generation, 2015. [[pdf]]
    ISBA@NIPS Special Travel Award for Contributed Paper, 2015.
    NIPS workshop on Networks in the Social and Information Sciences, 2015. [[pdf]]
    arXiv:1603.06898 [math.ST, stat.ME, stat.ML].

Theses

  1. Scalable methods for Bayesian online changepoint detection.
    Advisor: Ryan P. Adams
    Senior Thesis, Harvard University, 2014.


Selected Talks

  1. Edge-exchangeable graphs, sparsity, and power laws.
    Contributed talk in the 11th Conference on Bayesian Nonparametrics, June 2017.
  2. Edge-exchangeable graphs, sparsity, and power laws.
    Contributed talk in the Conference on Network Science (Netsci), June 2017.
  3. Paintboxes and probability functions for edge-exchangeable graphs.
    Contributed talk in the NIPS Workshop on Adaptive and Scalable Nonparametric Methods in Machine Learning, 2016.
  4. Edge-exchangeable graphs, sparsity, and power laws.
    Invited talk in the Isaac Newton Institute (INI) workshop on Bayesian methods for networks, July 2016. [[video link]]
  5. Edge-exchangeable graphs, sparsity, and power laws.
    Invited talk at the Massachusetts Institute of Technology, Machine Learning Tea seminar, July 2016.
  6. Edge-exchangeable graphs, sparsity, and power laws.
    Contributed talk in the NIPS workshop on Bayesian Nonparametrics: the Next Generation, December 2015. [pdf]
  7. Priors on exchangeable directed graphs.
    Contributed talk in The 10th Conference on Bayesian Nonparametrics, June 2015. [pdf]
  8. Efficient online variational changepoint detection.
    Machine Learning Tea Seminar, Harvard University, Feb 2013.

Poster presentations

  1. Paintboxes and probability functions for edge-exchangeable graphs.
    NIPS Workshop on Adaptive and Scalable Nonparametric Methods in Machine Learning, 2016.
  2. Completely random measures for modeling power laws in sparse graphs.
    NIPS workshop on Networks in the Social and Information Sciences, Dec 2015.
  3. Edge-exchangeable graphs, sparsity, and power laws.
    NIPS workshop on Bayesian Nonparametrics: The Next Generation, Dec 2015.
    NIPS workshop on Networks in the Social and Information Sciences, Dec 2015.
  4. Priors on exchangeable directed graphs.
    NIPS workshop on Bayesian Nonparametrics: The Next Generation, Dec 2015.
    Women in Machine Learning Workshop, Dec 2015.
  5. An iterative step-function estimator for graphons.
    Women in Machine Learning Workshop, Dec 2014.
  6. Efficient variational approximations for online Bayesian changepoint detection.
    New England Machine Learning Day Workshop, May 2014.
  7. Efficient variational approximations for online Bayesian changepoint detection.
    Women in Machine Learning Workshop, Dec 2013.

Teaching

University of Chicago

  • STAT 20000: Elementary Statistics. Teaching Assistant: Fall 2016.
  • STAT 22000: Statistical Methods and Applications. Teaching Assistant: Winter 2016, Spring 2016, Spring 2017.

Harvard University

  • CS181: Machine Learning. Teaching Fellow: Spring 2014.