
The Collaborative Study on the Genetics of Alcoholism (COGA)

The subjects in this study are drawn from the Collaborative Study on the Genetics of Alcoholism (COGA), a large, ongoing family-based study of alcoholism that includes subjects from seven sites around the US. COGA has gathered detailed, standardized data on study participants, including diagnostic and neurophysiological assessments. The project has identified several genes that influence the risk for alcoholism and neurophysiological endophenotypes, and these findings have been independently replicated. COGA data were included in two Genetic Analysis Workshops, and the phenotypes are familiar to the genetics community.

Alcoholic probands were recruited from treatment facilities and assessed by personal interview; after permission was secured, other family members were also assessed. A set of comparison families was drawn from the same communities as the families recruited through an alcoholic proband. Assessment involved a detailed personal interview developed for this project, the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA), which gathers detailed information on alcoholism-related symptoms, along with other drug use and psychiatric symptoms. Many participants also came to the laboratories for electroencephalographic studies. Several neurophysiological features have been shown to be useful endophenotypes, for which we have linkage and, in some cases, association results. These are included for a subset of the case-control sample: the beta power of the resting electroencephalogram (EEG), the P3(00) amplitude of the visual event-related potential (ERP), and the theta and delta event-related oscillations (EROs) underlying the P3.

As part of COGA, a set of informative families was selected for within-family genome-wide association genotyping. A total of 2,282 subjects selected from 118 densely affected families were genotyped on the Illumina Human OmniExpress 12v1 array at the Genome Technology Access Center at Washington University School of Medicine in St. Louis. We also included genotypes for 275 subjects from these 118 families who had been genotyped in a previous case-control GWAS using the Illumina 1M array. For quality control purposes, 51 of the 275 subjects were genotyped again on the Illumina Human OmniExpress array at the Washington University School of Medicine core facility.

In addition, exome sequencing data for a subset of individuals with GWAS data were added in version 2 (v2).

For v2, a subset had 30X Whole Genome Sequencing (WGS) as part of the NIDA Sequencing Initiative. The subset contained two distinct sets: (1) sibling pairs in which one sibling had at least two dependence diagnoses from the set (alcohol, cannabis, cocaine, and opioid) and the other had none; and (2) unrelated case-control pairs matched for age and ethnicity, in which the cases had alcohol dependence and at least two other dependence diagnoses and the controls had none. After sequencing, some sibling pairs were reclassified as half siblings. Three VCF files (small variants, structural variants, and copy number variations) are provided.
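The selection criteria for the two sequenced sets can be read as simple set logic. The following Python sketch is illustrative only and is not the study's actual selection procedure: the function names are hypothetical, each diagnosis is represented as a plain string, and it assumes that the "other" dependence diagnoses required of cases come from the same four-substance set. Matching on age and ethnicity is not modeled.

```python
from typing import Set

# The four dependence diagnoses named in the study description.
SUBSTANCES = frozenset({"alcohol", "cannabis", "cocaine", "opioid"})


def dependence_set(diagnoses: Set[str]) -> Set[str]:
    """Keep only diagnoses that belong to the four-substance set."""
    return set(diagnoses) & SUBSTANCES


def is_discordant_sibling_pair(sib_a: Set[str], sib_b: Set[str]) -> bool:
    """One sibling has at least two dependence diagnoses; the other has none."""
    a, b = dependence_set(sib_a), dependence_set(sib_b)
    return (len(a) >= 2 and not b) or (len(b) >= 2 and not a)


def is_case(diagnoses: Set[str]) -> bool:
    """Alcohol dependence plus at least two other dependence diagnoses.

    Assumption: the other diagnoses are drawn from SUBSTANCES.
    """
    d = dependence_set(diagnoses)
    return "alcohol" in d and len(d - {"alcohol"}) >= 2


def is_control(diagnoses: Set[str]) -> bool:
    """No dependence diagnosis from the four-substance set."""
    return not dependence_set(diagnoses)
```

For example, under these assumptions a pair in which one sibling has alcohol and cocaine dependence and the other has no dependence diagnosis qualifies as a discordant sibling pair, whereas a pair in which the second sibling has cannabis dependence does not.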

Additional substance use variables are made available in v2.

We note that the full sample data are deposited in four dbGaP submissions and the sequenced samples are split across all four:

CIDR: Collaborative Study on the Genetics of Alcoholism Case Control Study [phs000125]. GWAS data on cases (primarily probands) and controls drawn from the families.

Families with highest density of alcohol dependence and/or extreme event-related oscillation data [phs000763]. GWAS data on 119 extended families of European descent are available here, along with extensive documentation.

Study on the Genetics of Alcoholism (COGA): African American Family GWAS [phs000976]. GWAS data on all available COGA families of African descent are available.

COGA: Smokescreen GWAS [phs001208]. All remaining COGA DNA samples, primarily of other racial backgrounds, were genotyped on the Smokescreen array, and the resulting GWAS data are available here.

A listing of all sequenced pairs is provided in the documentation to facilitate the merging of these samples.