This month’s Topic Webinar

with guest instructor Jessica Thomson, Ph.D.

Correspondence analysis is a powerful exploratory multivariate technique for categorical variables. It characterizes associations between the levels of two or more categorical variables using graphical representations of the information in a contingency table, and it is particularly useful when those variables have many levels.

This presentation will give a brief introduction to and overview of correspondence analysis, including a review of chi-square analysis and examples of interpreting both simple and multiple correspondence plots.
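As a rough illustration of the chi-square review mentioned above, here is a minimal sketch of the Pearson chi-square statistic for a two-way contingency table. The function name and the counts are hypothetical and are not taken from the webinar; the only link to correspondence analysis assumed here is the standard one, that the total inertia of a table equals its chi-square statistic divided by the grand total.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: two levels of one variable by two levels of another
counts = [[10, 20], [20, 10]]
stat, dof = chi_square_statistic(counts)
total_inertia = stat / sum(sum(row) for row in counts)
print(stat, dof, total_inertia)
```

A statistic near zero means the observed counts sit close to what independence would predict, so a correspondence plot of such a table would have little structure to show.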

Dr. Jessica Thomson holds a PhD in mathematical statistics and currently works as a research epidemiologist for the USDA Agricultural Research Service’s Delta Human Nutrition Research Program. Dr. Thomson has a broad background in statistics, with specific emphasis on nutritional epidemiology as it relates to obesity. Her current projects include the design, implementation and evaluation of nutrition and physical activity interventions targeting the prevention of obesity in adults and children, as well as identification of dietary patterns in nationally representative child datasets.

October 14, 2015 at 3:00pm EDT (GMT -4)

Note: this webinar is only available to Data Analysis Brown Bag members.

Could you use some affordable ongoing statistical training with the opportunity to ask questions about statistical topics? Consider joining our Data Analysis Brown Bag program.


September 2015 Membership Webinar: Smoothing

Smoothing can assist data analysis by highlighting important trends and revealing long term movements in time series that otherwise can be hard to see. This presentation is pitched towards those who may use smoothing techniques during the course of their analytic work, but who have little familiarity with the techniques themselves.

Read the full article →

Generalized Linear Models in R, Part 7: Checking for Overdispersion in Count Regression

In my last blog post we fitted a generalised linear model to count data using a Poisson error structure. We found, however, that there was overdispersion in the data – the variance was larger than the mean in our dependent variable. One way to deal with overdispersion is to run a quasipoisson model, which fits an extra dispersion parameter to account for that extra variance…

Read the full article →
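To make the idea of overdispersion concrete, here is a minimal sketch that compares the sample variance of a count variable with its mean. The counts and the function name are hypothetical. This is only a crude, model-free check: the dispersion parameter in a quasipoisson model is estimated from the residuals of the fitted model, which this sketch does not attempt.

```python
from statistics import mean, variance


def variance_to_mean_ratio(counts):
    """Sample variance divided by sample mean; about 1 for Poisson-like data."""
    return variance(counts) / mean(counts)


# Hypothetical count data, not taken from the article
counts = [0, 1, 1, 2, 3, 5, 8, 13]
ratio = variance_to_mean_ratio(counts)

if ratio > 1:
    print("Variance exceeds the mean: possible overdispersion")
else:
    print("No sign of overdispersion in the raw counts")
```

A ratio well above 1 in the raw counts is a hint worth following up, not a verdict, because predictors in the model may explain much of that extra variance.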

Generalized Linear Models in R, Part 6: Poisson Regression for Count Variables

In my last couple of articles, I demonstrated a logistic regression model with binomial errors on binary data in R’s glm() function. But one of the wonderful things about glm() is that it is so flexible. It can run so much more than logistic regression models. The flexibility, of course, also means that you have to tell it exactly which model you want to run, and how…

Read the full article →

Generalized Linear Models in R, Part 5: Graphs for Logistic Regression

In my last post I used the glm() command to fit a logistic model with binomial errors to investigate the relationships between the numeracy and anxiety scores and their eventual success. Now we will create a plot for each predictor. This can be very helpful in understanding the effect of each predictor on the probability of a 1 response on our dependent variable…

Read the full article →

Generalized Linear Models (GLMs) in R, Part 4: Options, Link Functions, and Interpretation

Last year I wrote several articles that provided an introduction to Generalized Linear Models (GLMs) in R. As a reminder, Generalized Linear Models are an extension of linear regression models that allow the dependent variable to be non-normal. In our example for this week we fit a GLM to a set of education-related data…

Read the full article →

Stata Loops and Macros for Large Data Sets: Quickly Finding Needles in the Hay Stack

I recently opened a very large data set titled “1998 California Work and Health Survey” compiled by the Institute for Health Policy Studies at the University of California, San Francisco. There are 1,771 observations and 345 variables…

Read the full article →

August 2015 Membership Webinar: Latent Class Analysis

Latent Class Analysis is a method for finding and measuring unobserved latent subgroups in a population based on responses to a set of observed categorical variables.

Read the full article →

Random Intercept and Random Slope Models

This free, one-hour webinar is part of our regular Craft of Statistical Analysis series. In it, we will introduce and demonstrate two of the core concepts of mixed modeling: the random intercept and the random slope. Most scientific fields now recognize the extraordinary usefulness of mixed models, but they’re a tough nut to crack for someone who didn’t […]

Read the full article →

Using the Collapse Command in Stata

Have you ever worked with a data set that had so many observations and/or variables that you couldn’t see the forest for the trees? You would like to extract some simple information but you can’t quite figure out how to do it. Get to know Stata’s collapse command…

Read the full article →