Andrei Boutyline

University of California, Berkeley
PhD Candidate, Department of Sociology
Institute for the Study of Societal Issues



See below for my implementation of the Correlational Class Analysis (CCA) algorithm, as well as for my minimal version of the ASA Zotero style sheet. Please treat them as beta software and use at your own risk, and please email me with bugs/suggestions.

Correlational Class Analysis

Correlational Class Analysis (CCA) greatly improves the accuracy and speed of the Relational Class Analysis (RCA) algorithm developed by Goldberg (2011). It partitions the respondents of a survey dataset into schematic classes, such that members of each class have similar patterns of association between their responses, suggesting the possibility of a shared cultural schema. Goldberg (2011) explains why such shared schemas are sociologically interesting, and suggests a complex technique for identifying them via a novel measure he calls relationality. In this paper, I clarify this reasoning, and demonstrate that a far simpler technique based on Pearson correlations between rows (CCA) can reliably identify such classes with greater accuracy than RCA. It also runs much faster. See the appendix at the end of the paper for further explanation of the algorithm.
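
To make the idea of row correlations concrete, here is a minimal sketch in base R. It is not the corclass code: the toy data and the variable names are made up for illustration, and taking the absolute value of the correlations is my shorthand for treating mirror-image response patterns as evidence of the same schema. The sketch covers only the correlation step, not the partitioning into classes.

```r
# Toy survey: 4 respondents (rows) x 4 items (columns). Values are made up.
responses <- rbind(
  r1 = c(1, 2, 3, 4),
  r2 = c(2, 3, 4, 5),  # same pattern as r1, shifted up by one point
  r3 = c(4, 3, 2, 1),  # mirror image of r1
  r4 = c(1, 4, 1, 4)   # a different pattern
)

# cor() correlates columns, so transpose to correlate respondents (rows).
row_cor <- cor(t(responses))

# Assumption for this sketch: the sign is ignored, so r1 and r3 count as
# organizing their answers along the same dimension.
abs_cor <- abs(row_cor)

print(round(abs_cor, 2))
```

In this toy example r1, r2 and r3 are perfectly correlated in absolute value, while r4 is only weakly related to them, so a partition based on these values would be expected to separate r4 from the rest.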

The corclass 0.1 R package implementing CCA is available on CRAN. It can be installed by running:
  > install.packages("corclass")

This implementation makes heavy use of igraph. For more accurate results, make sure your igraph version is at least 0.7.
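
If you are not sure which igraph version you have, a quick check along the following lines should work. This is only a sketch using base R utilities; the variable name igraph_ok is made up for illustration.

```r
# Check whether igraph is installed and at least version 0.7.
if (requireNamespace("igraph", quietly = TRUE)) {
  igraph_ok <- utils::packageVersion("igraph") >= "0.7"
} else {
  igraph_ok <- FALSE  # igraph is not installed at all
}

if (!igraph_ok) {
  message("igraph is missing or older than 0.7; consider running install.packages(\"igraph\")")
}
```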

The R package comes with documentation and sample data, but if you want to download just the R script on its own, you can do so here: [cca.R].

ASA Zotero style sheet

I made a modified version of the ASA style sheet that aims to minimize the need to clean up the auto-generated references and in-line citations by hand. For example, citations to books or journal articles no longer include the URLs they came from, as these URLs are often arbitrary and uninformative. Also, having multiple entries with the same last name but different first names in the Zotero database no longer causes the in-line citations to automatically include first and middle initials, since name differences are more often than not due to minor inconsistencies in the database entries (e.g., mine contains entries for both Paul Lazarsfeld and Paul F. Lazarsfeld). There are also changes to capitalization and entry ordering.

You can [download the style sheet here]. Instructions for installing the style sheet can be found here (with Zotero Standalone, double-clicking the file should do the trick).

Update 11/17/13: the reference list is now double-spaced; fixed a bug that kept the URL out of the formats that needed it; the style sheet now fully matches the CSL specs.