Saturday, July 28, 2012

Statistics and Skewed Samples

An unsophisticated forecaster uses statistics as a drunken man uses lamp-posts - for support rather than for illumination.

- Andrew Lang

I've recently been doing statistical tests on coauthorship data we collected through our NSF Advance grant at New Jersey Institute of Technology. We're looking at questions such as gender differences and models for success in terms of promotion and rank.

The problem is that we have very few female faculty at NJIT. This problem is endemic to Science, Technology, Engineering, and Math (what many call STEM). Women either choose not to go into those fields or, once in them, they leave. Often they leave because colleagues create a harsh or hostile environment, which is why we want to find out whether women are excluded from the critical collegial social networks within the technological university setting.

Most statistical tests assume a random sample drawn from a larger population. In this case, however, we have data on the entire population, and that population is very heavily skewed towards male faculty. In looking at Associate Professors over a 10-year period, we do not have enough female faculty to even begin to assume a normal distribution. That raises the question: how can we test our hypotheses regarding the exclusion of women when so few women exist in our sample?

It's a quandary we're still trying to solve. We're trying various means (permutation tests, bootstrapping) but we may have to resort to simple qualitative descriptions. We're also using social network analysis methods to look for homophily in coauthorships - that is, do men prefer to coauthor with men and women prefer to coauthor with women, or is there also ample inter-gender collaboration? Results will hopefully be forthcoming.
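To make the permutation idea concrete, here is a minimal sketch of how one might test for homophily without assuming a normal distribution. It is not our actual analysis: the faculty names, gender labels, and coauthorship ties are a made-up toy example, and the function names (`same_gender_fraction`, `homophily_permutation_test`) are hypothetical. The test statistic is the share of coauthorship ties that link two people of the same gender. Shuffling the gender labels across faculty shows what that share would look like if gender had nothing to do with who coauthors with whom.

```python
import random


def same_gender_fraction(edges, gender):
    """Share of coauthorship ties that link two people of the same gender."""
    same = sum(1 for a, b in edges if gender[a] == gender[b])
    return same / len(edges)


def homophily_permutation_test(edges, gender, n_perm=10_000, seed=0):
    """One-sided permutation test for homophily.

    Keeps the network fixed and shuffles gender labels across people,
    so the number of men and women never changes. Returns the observed
    same-gender share and the estimated p-value.
    """
    rng = random.Random(seed)
    people = sorted(gender)
    labels = [gender[p] for p in people]
    observed = same_gender_fraction(edges, gender)

    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        shuffled = dict(zip(people, labels))
        if same_gender_fraction(edges, shuffled) >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimated p-value above zero.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy data (NOT the NJIT data): five men, two women.
gender = {"A": "M", "B": "M", "C": "M", "D": "M", "E": "M", "F": "W", "G": "W"}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("F", "G"), ("E", "F")]

observed, p_value = homophily_permutation_test(edges, gender)
print(f"same-gender share: {observed:.3f}, permutation p-value: {p_value:.3f}")
```

Because the shuffle preserves the actual counts of men and women, the skew in the population is built into the null distribution rather than ignored. With very few women, though, the number of distinct relabelings is small, so the p-value can only take a limited set of values and the test will have little power.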
