Data Sets

GenePattern Tutorial - Datasets and files used in the GenePattern Tutorial
gp_tutorial_files.tar.gz - Tutorial files (gzip format)
ALL/AML - Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression
Golub and Slonim et al., 1999
all_aml.tar.gz - All files (gzip)
   all_aml_train.gct - Training data set
   all_aml_train.res - Training data set
   all_aml_train.cls - Training class vector
   all_aml_test.gct - Test data set
   all_aml_test.res - Test data set
   all_aml_test.cls - Test class vector
   Golub_et_al_1999.R - Methodology (R script)
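
The class vector (.cls) files listed above pair each sample with a class label. The following is a minimal Python sketch of reading one, not an official GenePattern reader. It assumes the usual categorical CLS layout: a first line with the sample count and class count, a second line starting with "#" that lists the class names, and a third line with one label per sample, where class names are matched to labels in order of first appearance. The function name parse_cls is hypothetical.

```python
from typing import List, Tuple


def parse_cls(text: str) -> Tuple[List[str], List[str]]:
    """Parse categorical CLS text into (class_names, per_sample_class_names).

    Assumed layout (not stated in this listing):
      line 1: <num_samples> <num_classes> 1
      line 2: # <class name> <class name> ...
      line 3: one label per sample, separated by whitespace
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 3:
        raise ValueError("expected at least three non-empty lines")

    header = lines[0].split()
    n_samples, n_classes = int(header[0]), int(header[1])

    if not lines[1].startswith("#"):
        raise ValueError("second line should start with '#'")
    class_names = lines[1].lstrip("#").split()
    labels = lines[2].split()

    if len(class_names) != n_classes:
        raise ValueError("class count does not match header")
    if len(labels) != n_samples:
        raise ValueError("sample count does not match header")

    # Assumption: class names are assigned to distinct labels
    # in the order those labels first appear on line 3.
    seen: List[str] = []
    for lab in labels:
        if lab not in seen:
            seen.append(lab)
    if len(seen) > n_classes:
        raise ValueError("more distinct labels than declared classes")

    mapping = dict(zip(seen, class_names))
    return class_names, [mapping[lab] for lab in labels]


if __name__ == "__main__":
    # The file name comes from the listing above; the path is an assumption.
    with open("all_aml_train.cls") as fh:
        names, per_sample = parse_cls(fh.read())
    print(names, len(per_sample))
```

The parser takes text rather than a path so it can be checked against a small inline string before pointing it at a downloaded file.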
Lymphoma Outcome Prediction - Diffuse Large B-Cell Lymphoma Outcome Prediction by Gene Expression Profiling and Supervised Machine Learning
Shipp et al., 2002
dlbcl.tar.gz - All files (gzip)
   dlbcl_vs_fscc.res - DLBCL vs. FL morphology data set
   dlbcl_vs_fscc.cls - DLBCL vs. FL morphology class vector
   dlbcl_outcome.res - DLBCL outcome data set
   dlbcl_outcome.cls - DLBCL outcome class vector
Global Cancer Map - Multi-Class Cancer Diagnosis Using Tumor Gene Expression Signatures
Ramaswamy et al., 2001
GCM.tar.gz - All files (gzip)
   GCM_Total.res - Complete data set
   GCM_Total.cls - Class vector for complete set
   GCM_Normal.res - Normal samples
GISTIC - Assessing the significance of chromosomal aberrations in cancer: Methodology and application to glioma
Beroukhim et al., 2007
GISTIC_Hind_subset.zip - Affymetrix 250K Hind chip files (12 of 187 samples, 166 MB)
sample_info_subset.txt - Sample information file