What is GEPAS
GEPAS, which stands for "Gene Expression Pattern Analysis Suite", was born as a natural response to the data analysis requirements of the many experimental researchers working on gene expression using DNA microarrays.
GEPAS is not intended to be a collection of as many methods as possible, but an integrated tool that provides a simple yet rigorous solution to most straightforward (and some complex) scientific questions that can arise during microarray data analysis.
Unlike other tools, GEPAS has "grown up" in close contact with experimental researchers and has been continuously improved as new scientific questions arose. GEPAS is committed to evolving as new scientific questions emerge in relation to the use of gene expression profiles.
Who is behind GEPAS
GEPAS has been developed under the supervision of Joaquin Dopazo, head of the Department of Bioinformatics (http://bioinfo.cipf.es) at the Centro de Investigación Príncipe Felipe, Valencia, Spain (http://www.cipf.es). A large part of what GEPAS is today is the contribution of Javier Herrero in terms of ideas, design and internal architecture. Nevertheless, GEPAS is the collective effort of a group of people: the GEPAS team (see below) and some occasional collaborators.
Supervisor
- Joaquin Dopazo
The GEPAS team
- Fatima Al-Shahrour
- Eva Alloza
- Lucia Conde
- Javier Herrero
- Jaime Huerta
- Ignacio Medina
- Pablo Minguez
- David Montaner
- Sach Mukherjee
- Miguel A.G. Pujana
- Joaquin Tarraga
- Joan Valls
- Juan Manuel Vaquerizas
- Javier Vera
Former members of the GEPAS team
- Amaya Cabezon (2004)
- Ramon Diaz (2004)
- Alvaro Mateos (2005)
- Javier Santoyo (2004)
- Patricio Yankilevich (2004)
A bit of history
The first program implemented as a web server was the SOTA algorithm, in late 2000 at the CNIO. During 2001, new tools were added within an integrated environment: the preprocessor, more clustering methods (SOM, hierarchical), a method for gene selection (the pomelo tool), supervised clustering, and a tool for functional annotation (the popular FatiGO).
The first official release of GEPAS was in mid 2002, at the CNIO, and it was soon published in the first special web issue of NAR in 2003 (Herrero et al., 2003).
During 2003 GEPAS was oriented towards gene selection and prediction. New versions of pomelo (gene selection) and the Tnasas program (predictor) were added to GEPAS. A module for normalization of two-color arrays, DNMAD, was also included. Support for array-CGH was included for the first time in the form of a viewer, the InSilicoCGH tool. Finally, another module was added for functional annotation: FatiWise. This second version was officially released in the special web server issue of NAR in 2004 (Herrero et al., 2004).
During 2004 GEPAS gained further additions, such as the k-means clustering algorithm, and improvements to programs such as DNMAD, InSilicoCGH, etc. Nevertheless, the largest addition was the complete suite for functional annotation, Babelomics. During 2004, GEPAS was used to analyse more than 75,000 experiments, with an average of 300 experiments analysed per day. This version was officially released in the special web issue of NAR in 2005 (Vaquerizas et al., 2005).
In August 2005 version 2.0 was released. A completely new interface was provided and new tools were added. Among them are CAAT, a hierarchical cluster analyser and viewer, and ISACGH, a tool for estimating copy number in array-CGH experiments that includes visualization of the results. Normalization for Affymetrix arrays was also provided.
In February 2006 version 2.3 was released. CAAT was fully integrated into the clustering section. Differential gene expression was expanded beyond the simple t-test, and new, more reliable tests, such as the data-adaptive test, SAM, the Bayesian regularised t-test, etc., were included (the pomelo module was discontinued). ISACGH was improved with a DAS server that allows visualization of the results over Ensembl, and functional annotation is provided through a direct link to Babelomics. A new database schema for cross-equivalence of the different gene identifiers and the functional terms was also implemented. Although it is not visible to users, it increases the number of available gene ID equivalences. The Babelomics suite was also improved. A new interface was added, with the possibility of checking functional enrichment in heterogeneous terms (GO, pathways, etc.) simultaneously. More terms were added to FatiScan. We have also implemented the GSEA method.
Feedback
For comments, bug reports or suggestions for improvement, please contact us.
When reporting a bug or any apparent malfunction, please try to include as much information as possible about the problem, such as the programs and ALL the options you used. In many cases, the data you used helps a lot too.