Bioinformatics

Ruby Bioinformatics code (very basic)

Description:
This code is ported from the Python code in "Beginning Python for Bioinformatics" by Patrick O'Brien. It has only been lightly tested, so use it at your own risk.

Author's Homepage:
http://gavmacprogramming.wordpress.com/2007/03/29/exploring-bioinformatics-with-ruby/

Script File:
bioinformatics_april.zip (11.8 KB)

mcprimers

Description:
Front-end Perl script for MCPrimers.pm. Designs PCR primers for molecular cloning with appropriate site-directed mutagenesis. See CPAN for more details.

Author's Email:
slenk@emich.edu

Author's Full Name:
Steve Lenk

Author's Homepage:

Script File:
mcprimers (6.05 KB)

MCPrimers.pm

Description:
Perl module to design PCR primers for molecular cloning with appropriate site-directed mutagenesis. Use with mcprimers.pl and primer3. See CPAN for more information.

Author's Email:
slenk@emich.edu

Author's Full Name:
Steve Lenk

Author's Homepage:

Script File:
MCPrimers.pm (28.48 KB)

EnTuned.pl

Description:
EnTuned.pl reads a newline-delimited list of Ensembl numbers from a text file and accesses the ensembl.org website to find the corresponding Entrez and UniGene numbers. The output of the program is a comma-delimited text file containing 7 fields: the ensembl.org URL for the specific Ensembl number, the Ensembl number, the URL for the Entrez number, the Entrez number, the URL for the UniGene number, the UniGene number, and the description of the gene from the ensembl.org web page. In short, the program converts Ensembl numbers to Entrez and UniGene numbers.

Author's Email:
paul_a_wilson@mac.com

Author's Full Name:
Paul A. Wilson, Ph.D. (if a title must be used, the author prefers "motorcyclist" over "doctor")

Author's Homepage:
homepage.mac.com/paul_a_wilson

Script File:
EnTuned.pl.tar.gz (2.74 KB)

Dinucleotide shuffle with the Altschul & Erickson algorithm

Description:
This script is an implementation of the Altschul & Erickson algorithm for exact dinucleotide shuffling.
The following modules must be installed: Graph, Bio::DB::Fasta, Bio::Seq, Bio::SeqIO. All of them are available from CPAN (http://search.cpan.org).
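
To illustrate what "exact dinucleotide shuffling" means, here is a minimal Python sketch of the general idea, not the code shipped in dishuffleseq.pl. It treats the sequence as a path through a graph of symbols, picks a random "last edge" out of every symbol so that those edges all lead to the final symbol, shuffles the remaining edges, and then walks the graph. The function names and the `rng` parameter are hypothetical.

```python
import random
from collections import Counter, defaultdict


def _reaches(vertex, target, last_edge):
    """Follow the chosen last edges from vertex; True if they end at target."""
    seen = set()
    while vertex != target:
        if vertex in seen:  # cycle that never reaches the target
            return False
        seen.add(vertex)
        vertex = last_edge[vertex]
    return True


def dinucleotide_shuffle(seq, rng=random):
    """Return a shuffled copy of seq with the same dinucleotide counts."""
    if len(seq) < 3:
        return seq
    final = seq[-1]

    # edges[x] holds every symbol that directly follows an occurrence of x.
    edges = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        edges[a].append(b)

    # Choose one "last edge" per symbol (except the final one) and retry
    # until every symbol can reach the final symbol through those edges.
    vertices = [v for v in edges if v != final]
    while True:
        last_edge = {v: rng.choice(edges[v]) for v in vertices}
        if all(_reaches(v, final, last_edge) for v in vertices):
            break

    # Shuffle the remaining edges and put the chosen last edge at the end.
    order = {}
    for v, successors in edges.items():
        successors = list(successors)
        if v in last_edge:
            successors.remove(last_edge[v])
            rng.shuffle(successors)
            successors.append(last_edge[v])
        else:
            rng.shuffle(successors)
        order[v] = successors

    # Walk the graph from the first symbol, using each edge exactly once.
    used = Counter()
    current = seq[0]
    out = [current]
    for _ in range(len(seq) - 1):
        nxt = order[current][used[current]]
        used[current] += 1
        out.append(nxt)
        current = nxt
    return "".join(out)
```

The result always starts and ends with the same symbols as the input and has the same length; only the order in which the dinucleotides appear changes.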

Author's Full Name:
Diego Mauricio Riaño-Pachón

Author's Homepage:
http://www.geocities.com/dmrp.geo

Script File:
dishuffleseq.pl.gz (3.84 KB)

combine/permute a list from a file or pipe

Description:
combo -[pc]
Perform combinatoric transformations on a list of elements
separated by newline. Input may be a filename, or '-' to
read from STDIN. Combinations/Permutations are written to
STDOUT, one per line, with elements separated by tab.
Options:
-p permute list; this is the default behavior
-c combine list; this parameter requires an integer value
for how many of the list elements should be included
in the combination
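
The behavior described above can be approximated in a few lines of Python with the standard itertools module. This is a sketch of the same idea, not the combo script itself; the function names and the `k` parameter are hypothetical stand-ins for the -p and -c options.

```python
from itertools import combinations, permutations


def read_elements(stream):
    """Read one element per line, skipping blank lines."""
    return [line.rstrip("\n") for line in stream if line.strip()]


def combo(elements, k=None):
    """Yield tab-separated permutations (k is None) or k-combinations."""
    groups = permutations(elements) if k is None else combinations(elements, k)
    for group in groups:
        yield "\t".join(group)
```

For example, `for line in combo(read_elements(sys.stdin), k=2): print(line)` (after `import sys`) would write every 2-element combination of the input lines to STDOUT, one per line.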

Author's Email:
allenday@ucla.edu

Author's Full Name:
Allen Day

Author's Homepage:
http://search.cpan.org/~allenday

Script File:
combo.gz (746 bytes)

extract_genes.pl

Description:
extract_genes.pl - extract genomic sequences from NCBI files using BioPerl. This script is a simple solution to the problem of
extracting genomic regions corresponding to genes. There are other solutions; this particular approach uses genomic sequence
files from NCBI and gene coordinates from Entrez Gene.
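
The core step, cutting a gene out of a genomic sequence given its coordinates, can be sketched in Python as follows. This is an illustration only and does not reproduce the BioPerl script; it assumes 1-based inclusive coordinates and a "+" or "-" strand flag, and the function names are hypothetical.

```python
# Translation table for complementing DNA (N is kept as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(chromosome, start, end, strand="+"):
    """Return the region start..end (1-based, inclusive, an assumption here).

    Genes on the minus strand are returned as the reverse complement.
    """
    if not 1 <= start <= end <= len(chromosome):
        raise ValueError("coordinates fall outside the sequence")
    region = chromosome[start - 1:end]
    return reverse_complement(region) if strand == "-" else region
```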

Author's Email:
osborne1@optonline.net

Author's Full Name:
Brian Osborne

Author's Homepage:
http://bioperl.org

Script File:
extract_genes.pl.zip (1.76 KB)

Updates NCBI Blast Databases (e.g. for cron job)

Description:
fetch_ncbi_db.pl is a script I wrote to automatically update the BLAST databases from NCBI. We regularly need to make sure the databases are up to date, so we set up a weekly cron job to download them. It does not check whether the files have been updated before downloading them, since the databases we use are updated very regularly.
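
The unconditional-download approach described above might look like this in Python. It is a sketch under assumptions, not the Perl script: the function names are hypothetical, and the caller supplies the list of database archive URLs.

```python
import os
import urllib.request
from urllib.parse import urlparse


def target_path(url, dest_dir):
    """Local path for a download: dest_dir plus the file name in the URL."""
    return os.path.join(dest_dir, os.path.basename(urlparse(url).path))


def fetch_all(urls, dest_dir):
    """Download every URL into dest_dir, overwriting any existing copy.

    No timestamp or checksum comparison is made, mirroring the
    "always re-download" behavior described above.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = target_path(url, dest_dir)
        urllib.request.urlretrieve(url, path)
        saved.append(path)
    return saved
```

A weekly cron entry such as `0 3 * * 0 python3 fetch_blast_db.py` would then run a script built around `fetch_all` every Sunday at 03:00; the script name here is a placeholder.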

Author's Full Name:
Alexander Richter

Author's Homepage:

Script File:
fetch_ncbi_db.pl.gz (1 KB)

sequence extraction from PDB file

Description:
The script extracts a sequence from a PDB file and saves it in FASTA format.
Sequences of different chains are saved as separate FASTA sequences.
The title of each sequence is >pdb_file:chain, for example >1JTG:A
The program recognizes DNA or RNA sequences and gives an error.
The program recognizes short sequences (up to 60 aa) and gives an error.
The output is saved in a file with the same name as the input file plus a "fasta" suffix.
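
The rules above can be sketched in Python as follows. This is not the code in pdb_fasta.pl: it assumes the sequence is read from the SEQRES records at their fixed column positions (the original may read ATOM records instead), it treats "up to 60 aa" as 60 residues or fewer, and all names are hypothetical.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}
# Residue names used for RNA and DNA in SEQRES records.
NUCLEOTIDES = {"A", "C", "G", "U", "T", "DA", "DC", "DG", "DT", "DU"}
MIN_LENGTH = 61  # assumption: chains of 60 residues or fewer are rejected


def seqres_to_fasta(pdb_lines, pdb_name):
    """Build FASTA text with one record per chain, titled >pdb_name:chain."""
    chains = {}
    for line in pdb_lines:
        if line.startswith("SEQRES"):
            # Column 12 is the chain ID; residue names start at column 20.
            chains.setdefault(line[11], []).extend(line[19:].split())

    records = []
    for chain, residues in chains.items():
        if any(res in NUCLEOTIDES for res in residues):
            raise ValueError(f"chain {chain} is a DNA or RNA sequence")
        if len(residues) < MIN_LENGTH:
            raise ValueError(f"chain {chain} is too short")
        # Unknown or modified residues are written as X.
        sequence = "".join(THREE_TO_ONE.get(res, "X") for res in residues)
        records.append(f">{pdb_name}:{chain}\n{sequence}\n")
    return "".join(records)
```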

Author's Email:
dana.reichmann@weizmann.ac.il

Author's Full Name:
Dana Reichmann

Author's Homepage:
http://bioinfo.weizmann.ac.il

Script File:
pdb_fasta.pl.zip (1.39 KB)

compsort

Description:
compsort combinatorially scores large data sets. The score is based on a cumulative relative difference. I used this program to compare the distance-space descriptors of the low-energy conformations of four molecules to determine the four conformations that were most similar in three-dimensional space. A further description can be found in the comment section at the top of the C source file. I am working on a more gcc 4 compliant version.

Author's Email:
paul_a_wilson@mac.com

Author's Full Name:
Paul A. Wilson

Author's Homepage:
http://homepage.mac.com/paul_a_wilson/

Script File:
compsort.c.tar.gz (8.52 KB)