Blood Pressure

CASE STUDY

Last modified 2003-03-02 22:30

Please check this page regularly for updates, corrections, and answers to frequently-asked questions!

Acknowledgements

Our appreciation goes out to Dr. Raymond Lam, GlaxoSmithKline, Toronto, Ontario, Canada for providing this case study.

 

Introduction 

Genes contribute to the development and progression of disease, and they also influence how individuals respond to medicines. At GlaxoSmithKline (GSK), we are conducting genetic and genomic research that will allow the medical community to prescribe the right medicine for the right patient.

Genetics research studies often collect hundreds to thousands of genetic markers, together with many clinical measurements.  Statistical tools are useful for separating 'true' genes from 'false' alarms.

Data Description 
The data file (a comma-delimited ASCII file) contains 500 observations (subjects) and 501 variables.  Of the 500 subjects, 250 had low blood pressure and 250 had high blood pressure (i.e., hypertension).  The 501 variables consist of one response variable (systolic blood pressure) and 500 predictors (17 clinical covariates and 483 genetic markers).  These variables are described below.
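As a rough illustration of the layout just described, the following sketch reads a comma-delimited file and checks its dimensions. It assumes the file has a header row, which the description does not state; the function names, the column names, and the two-row sample are hypothetical stand-ins, not the real data.

```python
import csv
import io


def load_table(handle):
    """Read a comma-delimited file with a header row (an assumption).

    Returns the list of column names and a list of row dictionaries.
    """
    reader = csv.DictReader(handle)
    rows = list(reader)
    return reader.fieldnames, rows


def check_shape(fieldnames, rows, n_subjects=500, n_variables=501):
    """True if the table has the expected number of subjects and variables."""
    return len(rows) == n_subjects and len(fieldnames) == n_variables


# Tiny in-memory stand-in for the real file; column names are hypothetical.
sample = io.StringIO("SBP,Gender,Age,M1\n142.0,M,54,0_1\n118.5,F,37,1_1\n")
fields, rows = load_table(sample)
```

With the real file, the same two calls would be made on an opened file object, and `check_shape(fields, rows)` should hold if the file matches the description above.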

The attributes (variables) in this study are:

Variable                        Description
Systolic Blood Pressure (SBP)   Continuous response variable
Gender                          Binary variable: M = Male, F = Female
Marital Status                  Binary variable: Y = Married, N = Not Married
Smoking Status                  Binary variable: Y = Smoker, N = Non-Smoker
Age                             Continuous variable (years)
Weight                          Continuous variable (lbs)
Height                          Continuous variable (inches)
Body Mass Index (BMI)           Continuous variable: Weight / Height^2 * 703
Overweight                      Categorical variable: 1 = Normal, 2 = Overweight, 3 = Obese
Race                            Categorical variable taking values 1, 2, 3, or 4
Exercise Level                  Categorical variable: 1 = Low, 2 = Medium, 3 = High
Alcohol Use                     Categorical variable: 1 = Low, 2 = Medium, 3 = High
Stress Level                    Categorical variable: 1 = Low, 2 = Medium, 3 = High
Salt (NaCl) Intake Level        Categorical variable: 1 = Low, 2 = Medium, 3 = High
Childbearing Potential          Categorical variable: 1 = Male, 2 = Able Female, 3 = Unable Female
Income Level                    Categorical variable: 1 = Low, 2 = Medium, 3 = High
Education Level                 Categorical variable: 1 = Low, 2 = Medium, 3 = High
Treatment (for hypertension)    Binary variable: Y = Treated, N = Untreated
483 Genetic Markers             0_0, 0_1, 1_1
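The BMI formula in the table, and one common way to turn the marker values into numbers, can be sketched as follows. The function names are hypothetical, and reading 0_0, 0_1, and 1_1 as 0, 1, or 2 copies of allele 1 is an assumption about the coding, not something the case study specifies.

```python
def bmi(weight_lb, height_in):
    """Body Mass Index from weight in pounds and height in inches.

    Uses the formula given in the table: Weight / Height^2 * 703.
    """
    if height_in <= 0:
        raise ValueError("height must be positive")
    return weight_lb / height_in ** 2 * 703


def allele_count(genotype):
    """Count the '1' alleles in a marker value such as '0_1'.

    Assumption: the two characters around the underscore are the two
    alleles, so 0_0 -> 0, 0_1 -> 1, 1_1 -> 2 (additive coding).
    """
    alleles = genotype.strip().split("_")
    if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
        raise ValueError(f"unexpected genotype value: {genotype!r}")
    return sum(int(a) for a in alleles)


example_bmi = bmi(150, 65)  # a 150 lb, 65 inch subject: about 24.96
example_codes = [allele_count(g) for g in ("0_0", "0_1", "1_1")]  # [0, 1, 2]
```

Additive coding is only one option; treating each marker as a three-level categorical variable makes no assumption about how the alleles act.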

 

Objectives

For this case study, a genetic data set was generated from a complex genetic model that we developed at GSK.  There are 500 predictors (483 genetic markers and 17 clinical covariates).  The goal is to identify the 'true' predictors among the 500 variables and, at the same time, control the false discovery rate.  Therefore, the objectives are:

 

1.     Identify 'true' genes and clinical covariates; and

2.     Control false discovery (the number of true X's versus the number of false X's identified).
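One standard way to control the false discovery rate is the Benjamini-Hochberg step-up procedure. The case study does not prescribe a method, so the sketch below is only an illustration: it assumes a p-value has already been computed for each predictor, and the function names and example p-values are hypothetical. The second helper computes the proportion of false X's among the selected X's, which can be evaluated only when the true predictors are known, as they are for simulated data.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Indices of predictors selected by the Benjamini-Hochberg rule.

    Sort the m p-values, find the largest rank k such that
    p_(k) <= (k / m) * q, and select the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank  # keep the largest rank that passes
    return sorted(order[:k_max])


def false_discovery_proportion(selected, true_predictors):
    """Fraction of selected predictors that are not true predictors."""
    selected = set(selected)
    if not selected:
        return 0.0  # nothing selected, so no false discoveries
    false_hits = selected - set(true_predictors)
    return len(false_hits) / len(selected)


# Hypothetical p-values for ten predictors.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
picked = benjamini_hochberg(p, q=0.05)  # -> [0, 1]
```

In this example only the first two predictors pass: the third smallest p-value, 0.039, exceeds its threshold of 3/10 * 0.05 = 0.015, and no later rank passes either.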

Frequently Asked Questions

Please check this section regularly for updates.

References