This vignette describes how to prepare the data for a gene expression meta-analysis of lung cancer subtypes, following the method of Hughey and Butte (2015). You will need an internet connection to download the expression data and the custom CDF packages.

After completing the steps in this vignette, you can run the meta-analysis described in vignette('run_example').

The meta-analysis will use data from the following four studies:

Instructions

  1. Install the required custom CDF packages by running the following command in R.

    metapredict::installCustomCdfPackages(c('hgu95av2hsentrezgcdf',
                                            'hgu133plus2hsentrezgcdf'))
  2. Download each of the following files (we’re only downloading a subset of the Bhattacharjee data):
  3. Create a folder called metapredict_example. Inside metapredict_example, create a folder called expression_data. Inside expression_data, create a folder called Bhattacharjee.
  4. Unzip the files for the Bhattacharjee dataset and move all the .CEL files to the Bhattacharjee folder.
  5. Unzip GSE11969_series_matrix.txt.gz and GSE29016_series_matrix.txt.gz and move the resulting .txt files to the expression_data folder.
  6. Extract GSE30219_RAW.tar, rename the resulting folder GSE30219, and move it to the expression_data folder.
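
If you prefer to do the folder setup from R, steps 3 through 6 can be sketched roughly as follows. This is a minimal sketch using only base R. The helper names `setupExampleFolders` and `checkExpressionData` are hypothetical and not part of metapredict, and the parent folder (here the current working directory) is an assumption. The list of expected items is taken from the file and folder names in the steps above.

```r
# Hypothetical helper (not part of metapredict): create the nested folders
# from step 3 and return the path to expression_data.
setupExampleFolders = function(parentDir = '.') {
  exprDir = file.path(parentDir, 'metapredict_example', 'expression_data')
  # recursive = TRUE also creates metapredict_example and expression_data
  dir.create(file.path(exprDir, 'Bhattacharjee'), recursive = TRUE,
             showWarnings = FALSE)
  exprDir
}

# Hypothetical helper: return the names of expected files or folders
# (from steps 3 to 6) that are not yet present in expression_data.
checkExpressionData = function(exprDir) {
  expected = c('Bhattacharjee',
               'GSE11969_series_matrix.txt',
               'GSE29016_series_matrix.txt',
               'GSE30219')
  expected[!file.exists(file.path(exprDir, expected))]
}

# Assumption: the example lives in the current working directory.
exprDir = setupExampleFolders('.')
stillMissing = checkExpressionData(exprDir)
print(stillMissing)
```

Anything listed in `stillMissing` still has to be downloaded, unpacked, and moved into place. For step 6, base R's `untar()` may save a manual step: assuming GSE30219_RAW.tar is in the working directory, `untar('GSE30219_RAW.tar', exdir = file.path(exprDir, 'GSE30219'))` extracts the archive directly into a folder with the expected name.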