Creating a GenePattern Module

Posted on Saturday, December 15, 2012 at 11:36AM by The GenePattern Team

The following tutorial shows you how to create a new GenePattern module (in GenePattern 3.4 and up). Only the GenePattern team can create or install modules on the GenePattern public server. Therefore, to create a module, you need to have a local GenePattern server installed (see the download and installation page). You may also be interested in the video tutorial: Create a module in GenePattern.

In this tutorial, you will create a module named log_transform. The module invokes a Perl script, log_transform.pl, which log-transforms all positive values in a data set and sets all negative or zero values to zero. Before you begin, download the Perl script (log_transform.pl) and its documentation (LogTransform.pdf).
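
To make the transformation concrete, here is a minimal Python sketch of the behavior described above. It is not the log_transform.pl script itself: the function names are hypothetical, and the base-2 logarithm is an assumption, since this tutorial does not say which base the script uses.

```python
import math


def log_transform_value(value, base=2.0):
    """Return the logarithm of a positive value, or 0.0 for zero and negatives.

    The default base of 2 is an assumption made for illustration only.
    """
    if value > 0:
        return math.log(value, base)
    return 0.0


def log_transform_row(values, base=2.0):
    """Apply log_transform_value to every value in one row of data."""
    return [log_transform_value(v, base) for v in values]


# Positive values are transformed; zero and negative values become 0.0.
print(log_transform_row([8.0, 1.0, 0.0, -3.5]))
```

The point to notice is the branch on `value > 0`: the logarithm is undefined for zero and negative inputs, so those values are replaced with zero rather than passed to the log function.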

In GenePattern, to create the log_transform module:

  1. Click Modules & Pipelines>New Module. GenePattern opens the Module Integrator window.
    • Give it a Name: LogTransform
  2. Enter the following information in the Details fields:
    • Description: Log transform a data (gct) file.
    • Documentation: Upload the LogTransform.pdf file provided above. This is the documentation for the module. When a GenePattern user displays your module and clicks the Documentation button, GenePattern displays this file.
    • Author: Your name.
    • Organization: Your affiliation.
    • License: Do not add a license file. (This is to add a text file for modules that require the user to agree to an end user license agreement [EULA] before use.)
    • Version Comment: This module is new, so enter "Initial version." This field describes what has changed since the previous version of the module.
    • Module Category: Select Preprocess & Utilities from the drop-down menu.
    • Privacy: Select private. This means that only you (or an administrator) can view and run the module. Public allows all users connected to this server to view and run the module.
    • Quality Level: Select preproduction. This indicates that you have finished development, but are not yet ready for production.
    • CPU type: Select any.
    • Operating System: Select Any.
    • Language: Select Perl.
    • Output File Formats: Select gct from the list of formats.
  3. Use the Support Files section to upload the Perl program (the .pl file you downloaded before starting the tutorial):
    1. Click the Add files... button in the Support Files section. GenePattern displays the File Upload window.
    2. Select the log_transform.pl file and click Open. This is the script that implements the module.

    [Screenshot: the Module Integrator Details and Support Files sections after completing steps 1 through 3]

    You can click the blue arrows to the left of Details and Support Files to collapse those parts of the window, which gives you more room for working with your parameters.
  4. In the Parameters section, enter "2" and click Add Parameter. This gives you two blank parameter fields to work with.
  5. Describe your two program parameters: input.filename and output.file. The parameter names and descriptions that GenePattern displays when a user runs your module are the parameter names and descriptions that you provide here. In the first parameter, enter the following information for input.filename:
    • name: input.filename
    • description: The dataset to be transformed (gct format).
    • Flag: "-F "
      • Enter the flag without the quotes, and be sure to include the space after -F.
    • Type of field to display: select File Field
    • file format: select gct
  6. In the second parameter, enter the following information for output.file:
    • name: output.file
    • description: The name of the new transformed file.
    • Flag: "-o "
      • Enter the flag without the quotes, and be sure to include the space after -o.
    • Type of field to display: choose Text Field
    • Type of data to enter: choose Text (this is a name that the user enters)
  7. Review your command line. It should look like this:

       <perl> <libdir>log_transform.pl -F <input.filename> -o <output.file>

     The command line is a combination of fixed text and variables. This keeps the command line independent of the operating environment and allows different values to be supplied each time the module runs. This command line uses the following variables:
    1. <perl> represents the full path to the Perl installation used by GenePattern.
    2. <libdir> represents the full path to the directory that contains the files for this module, including the program file.
    3. <input.filename> and <output.file> represent the values of the two parameters that the Perl script, log_transform.pl, expects: an input file and an output file name. When your program has parameters, you include them in the command line and also define them in the Parameters section, as described in steps 5 and 6.
  8. Click Save. GenePattern displays a message informing you that the module has been saved.
  9. Click Run to confirm that it has been added to the GenePattern server correctly.
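
The command line from step 7 can be pictured as a template whose angle-bracket variables are filled in each time the module runs. The following Python sketch illustrates only that idea; it is not how GenePattern itself is implemented. The function name expand_command_line and every path and file name in the example values are hypothetical. The sketch also assumes that the value of <libdir> ends with a path separator, because the template places nothing between <libdir> and log_transform.pl.

```python
import re


def expand_command_line(template, values):
    """Replace each <name> placeholder in the template with values[name]."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for <{name}>")
        return values[name]

    # A placeholder is any run of characters between angle brackets.
    return re.sub(r"<([^<>]+)>", substitute, template)


# The command line from step 7 of this tutorial.
template = "<perl> <libdir>log_transform.pl -F <input.filename> -o <output.file>"

# Hypothetical values; a real server supplies its own paths.
values = {
    "perl": "/usr/bin/perl",
    "libdir": "/opt/genepattern/taskLib/LogTransform/",
    "input.filename": "/data/all_aml_test.gct",
    "output.file": "all_aml_test.log.gct",
}

print(expand_command_line(template, values))
```

Because the fixed text stays the same and only the variable values change, the same module definition can run on servers with different Perl locations and with different input files on each run.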
