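Introduction:
Inparanoid is a program for finding orthologous genes between species. Its companion website currently covers 100 organisms and 1,687,023 sequences.
The standalone version of Inparanoid is an excellent tool for identifying orthologs. It is currently at version 4.1 and can be requested online (http://software.sbc.su.se/cgi-bin/request.cgi?project=inparanoid).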
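Installation:
Inparanoid is a Perl script and can be run directly with perl inparanoid.pl, but it requires blastall (the newer BLAST+ has not been tested) and the XML::Parser Perl module.
1. Install blastall
On Ubuntu: sudo apt-get install blast2 (installs blast-2.2.26 by default, which is the latest version of blastall)
Note: blastall must be a version that supports the -C option. As far as I know blast-2.2.26 has it; some other versions, such as blast-2.2.9, may not.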
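One way to check for the -C option before running Inparanoid is to scan blastall's own option list. The sketch below assumes that blastall prints its option list when run without arguments, which is worth verifying on your installation; usage_lists_C_option is a hypothetical helper name, not part of BLAST or Inparanoid.

```shell
# Succeeds (exit status 0) if the usage text on stdin has a line
# that starts with a -C option. The expected layout of the usage
# text is an assumption about blastall's output.
usage_lists_C_option() {
    grep -Eq '^[[:space:]]*-C([[:space:]]|$)'
}

# Example, requires blastall on PATH:
# blastall 2>&1 | usage_lists_C_option && echo "-C is listed"
```

Reading the usage text from stdin keeps the check independent of how a particular blastall build has to be invoked to print its help.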
2. Install XML::Parser
a. Download the tarball XML-Parser-2.41.tar.gz
b. Unpack it: tar -zxvf XML-Parser-2.41.tar.gz, which yields the directory XML-Parser-2.41
c. Install Expat, the dependency module of XML-Parser-2.41 (bundled in the tarball): [run the following commands in order]
cd ./XML-Parser-2.41/Expat
perl Makefile.PL ### generate the Makefile
make ### compile
make install ### install
Possible error:
Expat.xs:12:19: fatal error: expat.h: No such file or directory
#include <expat.h>
^
compilation terminated.
Fix: sudo apt-get install libexpat-dev
d. Install the XML::Parser module: [run the following commands in order]
cd ../ ### back to the XML-Parser-2.41 directory
perl Makefile.PL
make
make install
After installation, check whether the directory XML::Parser was installed into is listed in Perl's @INC (use perl -V to view @INC). If it is not, the following error appears:
Can't locate XML/Parser.pm in @INC (you may need to install the XML::Parser module) (@INC contains: /etc/perl /usr/local/lib/perl/5.18.2 /usr/local/share/perl/5.18.2 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.18 /usr/share/perl/5.18 /usr/local/lib/site_perl .) at xmlfilter line 7.
Fix: specify the XML::Parser path explicitly in blast_parser.pl, as follows.
Output when installing the XML::Parser module:
#######################################################################
yangl@yangl:~/下载/XML-Parser-2.41$ make install
make[1]: Entering directory '/home/yangl/下载/XML-Parser-2.41/Expat'
make[1]: Leaving directory '/home/yangl/下载/XML-Parser-2.41/Expat'
Files found in blib/arch: installing files in blib/lib into architecture dependent library tree
Installing /home/yangl/perl5/lib/perl5/x86_64-linux-gnu-thread-multi/XML/Parser.pm
Installing /home/yangl/perl5/lib/perl5/x86_64-linux-gnu-thread-multi/XML/Parser/LWPExternEnt.pl
Installing /home/yangl/perl5/lib/perl5/x86_64-linux-gnu-thread-multi/XML/Parser/Encodings/iso-8859-4.enc
Installing /home/yangl/perl5/lib/perl5/x86_64-linux-gnu-thread-multi/XML/Parser/Encodings/windows-1250.enc
…….
#######################################################################
@INC is:
########################################################################
yangl@yangl:~/下载/XML-Parser-2.41$ perl -V
……..
@INC:
/etc/perl
/usr/local/lib/perl/5.18.2
/usr/local/share/perl/5.18.2
/usr/lib/perl5
/usr/share/perl5
/usr/lib/perl/5.18
/usr/share/perl/5.18
/usr/local/lib/site_perl
##################################################################
The install directory is not in @INC, so the XML::Parser path has to be specified explicitly in blast_parser.pl:
###################################################################
yangl@yangl:~/下载/inparanoid_4.1$ vi blast_parser.pl
#!/usr/bin/perl
use strict;
use warnings;
#use lib '/afs/pdc.kth.se/home/k/krifo/vol03/domainAligner/Inparanoid_new/lib64';
#use lib '/afs/pdc.kth.se/home/k/krifo/vol03/domainAligner/Inparanoid_new/lib64/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi';
use lib '/home/yangl/perl5/lib/perl5/x86_64-linux-gnu-thread-multi'; ## path to XML::Parser
use XML::Parser;
…….
###################################################################
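An alternative that avoids editing blast_parser.pl is to add the install directory to Perl's module search path through the PERL5LIB environment variable. The sketch below defines a small helper that reports where Perl would load a module from, so you can confirm the setting took effect. perl_module_path is a hypothetical helper name, and the directory in the commented example is the one from the install log above; yours may differ.

```shell
# Print the file Perl would load for a module, or fail if the
# module cannot be found on the current @INC / PERL5LIB.
# Usage: perl_module_path XML::Parser
perl_module_path() {
    perl -e '
        my $mod = shift;
        (my $file = "$mod.pm") =~ s{::}{/}g;   # XML::Parser -> XML/Parser.pm
        eval { require $file; 1 } or exit 1;   # not loadable: non-zero exit
        print "$INC{$file}\n";
    ' "$1"
}

# Example (path taken from the install log above, adjust as needed):
# export PERL5LIB="/home/yangl/perl5/lib/perl5/x86_64-linux-gnu-thread-multi${PERL5LIB:+:$PERL5LIB}"
# perl_module_path XML::Parser
```

Setting PERL5LIB in the shell applies to every Perl script started from that shell, whereas the use lib line only affects the script it is written in.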
If the description part of the FASTA headers is too long, the following problem may occur:
###################################################################
fasta:
>OS01T0100200-01 pep:known chromosome:IRGSP-1.0:1:11218:12435:1 gene:OS01G0100200 transcript:OS01T0100200-01 description:"Note\x3dConserved hypothetical protein., Transcript_evidence\x3dAK059894 (DDBJ, Best hit), ORF_evidence\x3dB8ACR2 (UniProt), NIAS_FLcDNA\x3d006-208-E01,"
MEEAGERDADETHAWSGTASPAALWKTVASSAAMLKLALAMISAAFRTTPFSMSMQLCPN
ATMSLHSPSIFDVVSSITPIMSCIINNRLVAEKAGATMQRWRAHSSPSAMTRPLPNMGMR
LSSYDIVCQLAHLHFSHVCCLV
Error:
Starting second BLAST pass for Oryza_sativa.2M.faa - Oryza_sativa.2M.faa on Mon Sep 22 08:55:26 CST 2014
[formatdb] WARNING: Cannot add sequence number 1 (lcl|1_./tmpd) because it has zero-length.
[formatdb] FATAL ERROR: Fatal error when adding sequence to BLAST database.
[blastall] WARNING: Unable to open tmpd.pin
[blastall] WARNING: Unable to open tmpd.pin
[blastall] WARNING: “: Unable to open tmpd.pin
[blastall] WARNING: “: Unable to open tmpd.pin
[blastall] WARNING: “: Unable to open tmpd.pin
[blastall] WARNING: “: Unable to open tmpd.pin
[blastall] FATAL ERROR: “: Database ./tmpd was not found or does not exist
no element found at line 1, column 0, byte -1 at ./blast_parser.pl line 111.
Fix: remove the description part of the FASTA headers, using the program faTruncate.pl:
#!/usr/bin/perl
use strict;
use warnings;
###############################################################
# Author: yangl
# Date: 2014.9.9
# Description: This program is to truncate the description part of fasta file
###############################################################
my $usage = "\tDescription: This program is to truncate the description part of fasta file.\n\tUsage: $0 <fasta_file>\n";
if(!@ARGV){
    print STDERR $usage;
    exit 1;
}
open IN,"<","$ARGV[0]" or die "Cannot open $ARGV[0]: $!";
open OUT,">","$ARGV[0].truncate" or die "Cannot write $ARGV[0].truncate: $!";
while(<IN>){
    # On header lines, keep only the text before the first whitespace
    s/^(\>.+?)\s+.*\n$/$1\n/;
    print OUT "$_";
}
close OUT;
=pod
Debugging loop, disabled: reports which lines match the header pattern.
while(<IN>){
    if(/^(\>.+?)\s+.*\n$/){
        print "$1\n";
    }else{
        print "dismatch\n";
    }
}
=cut
close IN;
##############################################################
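For a quick one-off, the same header truncation can be sketched as a shell function around awk. This is an illustrative equivalent rather than the author's script: truncate_fasta_headers is a hypothetical name, and it assumes the sequence ID follows the > directly, with no space in between.

```shell
# Keep only the sequence ID (first whitespace-delimited field) on each
# FASTA header line; sequence lines are passed through unchanged.
# Usage: truncate_fasta_headers in.faa > out.faa
truncate_fasta_headers() {
    awk '/^>/ { print $1; next } { print }' "$1"
}
```

Unlike faTruncate.pl, which writes to a file with a .truncate suffix, this version prints to standard output, so redirect it to the file name you intend to pass to Inparanoid.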
Run: perl inparanoid.pl A.pep B.pep
Program output (controlled by these options):
# Output options: #
$output = 1; # table_stats-format output #
$table = 1; # Print tab-delimited table of orthologs to file "table.txt" #
# Each orthologous group with all inparalogs is on one line #
$mysql_table = 1; # Print out sql tables for the web server #
# Each inparalog is on separate line #
$html = 1; # HTML-format output #
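Note: part of this content comes from http://www.plob.org/2011/12/11/905.html
Please respect other people's work; when reposting, credit the source: Bluesky's blog » Inparanoid: software for finding orthologous genes between species
Hello, I ended up with the last error message in your post: Cannot add sequence number 1 (lcl|1_./tmpd) because it has zero-length
Could you send me faTruncate.pl, the program you used to solve the problem?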
#! /usr/bin/perl
use strict;
use warnings;
###############################################################
# Author: yangl
# Date: 2014.9.9
# Description: This program is to truncate the description part of fasta file
###############################################################
my $usage = "\tDescription: This program is to truncate the description part of fasta file.\n\tUsage: $0\n";
if(!@ARGV){
print STDERR $usage;
exit 1;
}
open IN,"<","$ARGV[0]" or die "Cannot open $ARGV[0]: $!";
open OUT,">","$ARGV[0].truncate" or die "Cannot write $ARGV[0].truncate: $!";
while(<IN>){
s/^(\>.+?)\s+.*\n$/$1\n/;
print OUT "$_";
}
close OUT;
=pod
while(<IN>){
if(/^(\>.+?)\s+.*\n$/){
print "$1\n";
}else{
print "dismatch\n";
}
}
=cut
close IN;
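I have also added the code to the blog post!
Thank you.
Hello, I downloaded InParanoid but could not find an installer. Why is that?
How do I install it? Sorry for the trouble.
inparanoid is just a Perl script; the inparanoid.pl script you see in the download is the program itself.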