About

I’m Geoff Thompson, a Data Scientist at Cisco. Until recently, I was a Visiting Assistant Professor at Indiana University. I have a PhD from Iowa State University (2021). My research interests are in clustering and other classification problems, as well as unsupervised and semi-supervised learning. In addition to my research, I have worked in the statistical consulting groups at Iowa State University and Indiana University. My software is on my GitHub, and the list of projects below shows some of the work I’ve done.

I work mostly on the “Model” part of the “Import, Tidy, Transform, Visualize, and Model Data” formulation of Data Science. The applications have mostly been image-related (forensics and medical imaging), but I don’t care what things look like. I also enjoyed working with the statistical consulting group for a couple of years, since other people’s problems are more interesting than my own: I get to work on the interesting statistics and learn a little about a new field, then they get to do the hard work of finishing the project and writing, after we make sure we both understand it. For a moment, you feel like an expert in a completely foreign field.

I also like long walks on the beach.

Awards

  • 2021 ISU Graduate Research Excellence Award
  • 2021 ASA Section on Statistics in Imaging Student Paper Award
  • 2020 ISU Graduate Teaching Excellence Award
  • 2017 ISU Departmental Award for Excellence in Statistical Consulting
  • 2015 ISU Departmental Award for Excellence in Statistical Computing
  • 2014 ASA Statistical Computing Section Student Paper Award

Projects

  • Forensic matching of knife fragments using matrix variate discriminant analysis, along with some other applications of the method.

  • MixMatrix, an R package for working with matrix variate distributions, such as the matrix variate normal and matrix variate t-distributions, and for performing discriminant analysis and clustering with them. On GitHub and CRAN.

  • CholWishart, an R package for some distributions and functions related to the Wishart distribution, in particular sampling from the Cholesky factorization of the Wishart and inverse Wishart. On CRAN and GitHub.

  • Porting scalable k-means++ (or k-means||) to OpenMPI (a minor work).

  • A small C program to replace the use of the Rmath standalone library for random number generation.

  • CatSIM, a similarity metric for binomial and multinomial image comparison that accounts for structural similarity, modeled after the MS-SSIM metric. See it on arXiv and on CRAN.

  • In progress: rNFFT, an R wrapper for the NFFT library (nonequispaced fast Fourier transform). There are a lot of functions in the NFFT library. What I have works, though it doesn’t save and re-use “plans”, which would make it more useful. If I had an application for it other than the Radon and inverse Radon transforms, I would revisit it.

  • In progress: a parallelization of Hartigan and Wong’s k-means algorithm for OpenMPI. This was the third chapter of my dissertation.

  • In progress: a clustering-based image compression algorithm (see the 2014 award). This was the first chapter of my dissertation.

  • In progress: multinomial clustering with covariates.

  • In progress: even more.

Other activities

  • STATCOM (Statistics in the Community). Formerly the president of the Iowa State University chapter. It provides outreach to the greater community: statistical consulting for non-profits as well as educational and informative events.
  • President of the Parish Council - 2015-16, Holy Transfiguration Orthodox Christian Church, Ames, IA.

Prior work

  • Human capital analyst in the HR department of Health Care Service Corporation for 5 years. I provided analytics and supported data-driven decision making for the Human Resources department in their consultations with internal customers. There were some cool ideas here, but the work made me realize that I needed a bit more formal grounding in this, so I decided to get a PhD in statistics.