### #statistics and #machinelearning

• 16 Feb 2018 » Natural gradient descent and mirror descent

In this post, we discuss the natural gradient and present the main result of Raskutti and Mukherjee (2014), which shows that the mirror descent algorithm is equivalent to natural gradient descent in the dual Riemannian manifold….

• 21 Nov 2017 » The Johnson-Lindenstrauss Lemma

The so-called curse of dimensionality reflects the idea that many methods are more difficult in higher dimensions. This difficulty may be due to a number of issues that become more complicated in higher dimensions…

• 15 Nov 2017 » References on Bayesian nonparametrics

This post is a collection of references for Bayesian nonparametrics that I’ve found helpful or wish that I had known about earlier….

• 21 Oct 2017 » Wavelets and adaptive data analysis

For data that have a high signal-to-noise ratio, a nonparametric, adaptive method might be appropriate. In particular, we may want to fit the data to functions that are spatially inhomogeneous, i.e., the smoothness of the function $f(x)$ varies a lot with $x$. In this post, we will discuss wavelets, which can be used as an adaptive nonparametric estimation method….

• 25 Jan 2015 » Important inequalities for probabilistic methods

In statistics, machine learning, theoretical computer science, and any other field that incorporates randomness, we are interested in studying the behavior of random quantities (e.g., asymptotic convergence rates, approximation error) using probabilistic tail bounds….

### #probability

• 04 Jan 2015 » Measure theory basics

Those interested in machine learning may be wondering why they should be familiar with measure theory. One of the main reasons has to do with why we need measure theory for probability in general….

### #computerscience

• 05 Nov 2017 » Online decision making under total uncertainty

In this post, we will discuss a few simple algorithms for online decision making with expert advice. In particular, this setting assumes no prior distribution on the set of outcomes, but we use hindsight to improve future decisions. The algorithms discussed include simple deterministic and randomized weighted majority algorithms, and the multiplicative weights algorithm….

• In this post, we’ll review linear systems and linear programming. We’ll then focus on how to use LP relaxations to provide approximate solutions to other (binary integer) problems that are NP-hard….

• 29 Dec 2014 » Perfect Hashing with a 2-Universal Hash Family

Previously, we discussed universal hash families. In this post, we discuss a method of using a 2-universal hash family along with a Las Vegas algorithm to allow for perfect hashing, where the time required to find an item in a hash table is constant….

• 06 Sep 2014 » Universal hash functions

Hashing is a general method of reducing the size of a set by reindexing the elements into $$n$$ bins. This is done using a hash function, which maps some set $$U$$ into a range $$[0, n-1]$$. When designing a hash function, we are interested in something that maps elements into a bin in a way that appears random….

### #misc

• 11 Aug 2018 » Julia v1.0 First Impressions

The Julia Programming Language has finally released version 1.0! I was still using version 0.6 when pretty much suddenly both v0.7 and v1.0 came out (thanks JuliaCon). So I didn’t get to actually try out a lot of the new features until now. Here I’ll document the installation process and some first impressions….

• 28 Jun 2018 » My LaTeX Setup - vim, Skim, and latexmk

I’ll describe my current LaTeX setup – vim, Skim, and latexmk. I currently write notes and papers using LaTeX on a Mac, though in a previous life, my setup on Linux was very similar…

• 12 Sep 2015 » Making this blog with Jekyll, Hyde, MathJax, …

I recently switched my old WordPress.com blog over to using Jekyll. The main reasons were better math display and easier $\LaTeX$ parsing, but also that it was easier to customize the layout….