This section presents information on tools used for genome annotation, sequence analysis, and sites for data retrieval.

Links need reviewing. Appearance on the list does not imply endorsement by TAIR.

Table of Contents
Gene Structural Annotation Tools

Links to the most popular tools used for genomic sequence annotation.

Software Downloads

Links to available open source software for genome annotation.

Multiple Sequence Alignment Tools

Links to multiple sequence alignment tools.

Comprehensive Sequence Analysis Resources

Launch sites for a variety of sequence analysis tools.

Comparative Resources

Genome comparison resources.

Plant Promoter and Regulatory Element Resources

...

RepeatMasker

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked.
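The masking step described above can be illustrated with a minimal sketch. This is not RepeatMasker's own code or file format: the function name `mask_intervals` and the 0-based, half-open `(start, end)` interval convention are assumptions made for this example only. It shows how a list of annotated repeat positions turns into a masked copy of the query sequence, either hard-masked with `N` or soft-masked in lowercase.

```python
def mask_intervals(seq, intervals, soft=False):
    """Return a copy of seq with every annotated interval masked.

    intervals: iterable of (start, end) pairs, 0-based and half-open
               (an assumption of this sketch, not a RepeatMasker format).
    soft:      if True, masked bases are lowercased; otherwise replaced by 'N'.
    """
    chars = list(seq)
    for start, end in intervals:
        # Clamp each interval to the sequence bounds before masking.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].lower() if soft else "N"
    return "".join(chars)


if __name__ == "__main__":
    query = "ACGTACGTAAAAAAAACGT"
    repeats = [(8, 16)]  # hypothetical annotation of a low-complexity run
    print(mask_intervals(query, repeats))
    print(mask_intervals(query, repeats, soft=True))
```

Soft masking keeps the original bases recoverable, which can be useful when a downstream tool should ignore repeats while seeding but still align through them.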

Codon Usage Database (Kazusa)

Codon usage tables for many organisms, including Arabidopsis thaliana, from the Kazusa Institute.
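For readers unfamiliar with what a codon usage table contains, the following sketch computes one from a single coding sequence. It is an illustration only: the function name `codon_usage` is hypothetical, the per-thousand scaling is assumed here as a common reporting convention, and a real table is compiled from many coding sequences rather than one.

```python
from collections import Counter


def codon_usage(cds):
    """Return {codon: frequency per thousand codons} for one coding sequence.

    The sequence is read in frame from its first base; any trailing
    bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}


if __name__ == "__main__":
    # Toy sequence: ATG GCT GCT TAA
    for codon, freq in sorted(codon_usage("ATGGCTGCTTAA").items()):
        print(codon, freq)
```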

Arabidopsis GeneSeqer

AtGDB GeneSeqer webserver for predicting splice junctions in Arabidopsis sequences. Includes a tutorial on how to use the tool.

GENEMARK

Family of gene prediction programs provided by the Bioinformatics Group at the Georgia Institute of Technology.

TSSP-TCM

Plant promoter identification

WISE2

Wise2 compares a protein sequence to a genomic DNA sequence, allowing for introns and frameshifting errors.

GrailEXP

Software package that predicts exons, genes, promoters, poly-A sites, CpG islands, EST similarities, and repetitive elements within DNA sequences.

GENSCAN

MIT's new webserver for GENSCAN. GENSCAN is used to predict the locations and intron/exon boundaries of genes in a genomic sequence. Select Arabidopsis as the organism when searching for Arabidopsis genes in a genomic sequence.

NetPlantGene | NetGene2

Predictions of Arabidopsis splice sites from CBS.

NetStart

Prediction software for Arabidopsis translation starts from CBS.

GeneFinder

Searches for splice sites and protein-coding exons, constructs gene models, and identifies promoter and poly-A signals.

Software Downloads

Generic Model Organism Database (GMOD) Pages at Sourceforge

Everything you need to set up a MOD and annotate a genome, all as open source software.

Sourceforge

The place to find and build open source software

BioConductor

Open source software downloads and open development environment for bioinformatics software.

Multiple Sequence Alignment Tools

CLUSTALW

Compares overall sequence similarity of multiple sequences.

MEME (Multiple EM for Motif Elicitation)

Analyzes your sequences for similarities among them and produces a description (motif) for each pattern it discovers.

Block Maker

Finds conserved blocks in a group of two or more unaligned protein sequences.

BOXSHADE

Highlights conserved residues of the resulting multiple sequence alignment.
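As a rough illustration of what "conserved residues" means in an alignment, the sketch below reports the columns where every sequence carries the same residue. The function name `conserved_columns` is hypothetical, and strict identity across all sequences is a simplifying assumption for this example; shading tools may apply a looser threshold or group similar residues.

```python
def conserved_columns(aligned):
    """Return 0-based indices of columns identical in every aligned sequence.

    aligned: list of equal-length strings, with '-' marking gaps.
    Columns whose shared character is a gap are not reported.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    first = aligned[0]
    return [
        i for i in range(length)
        if first[i] != "-" and all(s[i] == first[i] for s in aligned)
    ]


if __name__ == "__main__":
    alignment = ["MKT-A", "MRTLA", "MKTLA"]  # toy alignment
    print(conserved_columns(alignment))
```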

CINEMA (Colour INteractive Editor for Multiple Alignments)

Editing tool that allows the user to manipulate the alignment.

SGN Alignment Analyzer

Aligns DNA or protein sequences and graphically displays the results. Accepts AGI codes as input as well as unaligned or aligned sequences.

SGN Tree Browser

Calculates and displays trees based on alignment data (accepts several different input formats)

Base-By-Base

Whole genome pairwise and multiple alignment editor.

Comprehensive Sequence Analysis Resources

BCM Launcher

Molecular biology-related search and analysis services from Baylor College of Medicine.

EBI Toolbox

List of bioinformatic tools and resources

GenePalette

A cross-platform and cross-species desktop application for genome sequence visualization and navigation.

Comparative Resources

HomoloGene

NCBI's system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes.

Inparanoid

A collection of pairwise comparisons between 17 eukaryotic whole genomes including Arabidopsis thaliana, useful for the identification of orthologs and differentiation between inparalogs and outparalogs.

Phytozome

Contains comparisons of Arabidopsis, rice, and poplar.

PLAZA

Access point for plant comparative genomics that centralizes genomic data produced by different genome sequencing initiatives. PLAZA integrates plant sequence data and comparative genomics methods and provides an online platform for evolutionary analyses and data mining within the green plant lineage (Viridiplantae).

CoGe

A comparative genomics platform designed to allow easy access to genomic data from any organism and provide analysis tools for finding and comparing homologous sequences from multiple genomic regions.

Positional history of A. thaliana genes

Archived data set showing the chromosomal positional histories of Arabidopsis genes. This dataset accompanied the paper Woodhouse MR, Tang H, Freeling M (2011) Different gene families in Arabidopsis thaliana transposed in different epochs and at different frequencies throughout the rosids. The Plant Cell 23(12): 4241-4253. http://dx.doi.org/10.1105/tpc.111.093567

Plant Promoter and Regulatory Element Resources

AGRIS

Currently contains two databases, AtcisDB (Arabidopsis thaliana cis-regulatory database) and AtTFDB (Arabidopsis thaliana transcription factor database).

AthaMap

A genome-wide map of putative transcription factor binding sites in Arabidopsis thaliana.

AtProbe

The Arabidopsis thaliana promoter binding element database, an aid to find binding elements and check data against the primary literature.

DATF: Database of Arabidopsis Transcription Factors

The Database of Arabidopsis Transcription Factors (DATF) contains known and predicted Arabidopsis transcription factors with sequences and many other features, including 3D structure templates, EST expression information, transcription factor binding sites, and nuclear localization signals.

ATCOECIS

This resource can be used to query co-expression data and GO and cis-regulatory element annotations, and to submit user-defined gene sets for motif analysis in Arabidopsis. It provides an access point for unraveling the regulatory code underlying transcriptional control in Arabidopsis.

DoOP: Databases of Orthologous Promoters

A database containing orthologous clusters of promoters from Homo sapiens, Arabidopsis thaliana and other organisms.

PlantCare

Database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences.

PlantProm DB

Database with annotated, non-redundant collection of proximal promoter sequences for RNA polymerase II with experimentally determined transcription start sites (TSS) from various plant species.

PLACE

Database of motifs found in plant cis-acting regulatory DNA elements, all from previously published reports. It covers vascular plants only.

PlantTFDB: Plant Transcription Factor Database

An integrative plant transcription factor database that provides a web interface to access large (close to complete) sets of transcription factors of several plant species, currently encompassing Arabidopsis thaliana (thale cress), Populus trichocarpa (poplar), Oryza sativa (rice), Chlamydomonas reinhardtii and Ostreococcus tauri.

ppdb (Plant Promoter DB)

Database that provides transcription start sites (TSS) and other structural information for Arabidopsis and rice promoters.

PRI-CAT

Plant Research International ChIP-seq analysis tool is a web-based workflow tool for the management and analysis of ChIP-seq experiments. Users can directly submit their sequencing data to PRI-CAT for automated analysis.

Transfac

Database on eukaryotic transcription factors, their genomic binding sites and DNA-binding profiles. Commercial site.

Proteome Resources

Links to proteome analysis tools and repositories.

Database Searches

...

Nucleotide and Protein Databases

Entrez

NCBI's Entrez databases. Retrieve sequences and other data, including literature from PubMed.

UniProt

UniProt merges three databases (Swiss-Prot, PIR, and TrEMBL) and replaces them. Search UniProtKB, a database of curated protein sequences (formerly Swiss-Prot).

PDB

Protein Data Bank, the repository for the processing and distribution of 3-D macromolecular structure data.

miRBase

Database of microRNA sequences from more than 270 species, including A. thaliana.

BLAST servers

TAIR BLAST

Search against all public Arabidopsis sequences, several subsets of them, or all higher plant sequences from GenBank. These datasets can be downloaded.

NCBI BLAST

BLAST server at NCBI

BLAST help

BLAST manual and user guide from NCBI