Table 2 Online repositories tailored for sORF identification

From: Short open reading frames (sORFs) and microproteins: an update on their identification and validation measures

| Database | References | Website | Type | Description |
| --- | --- | --- | --- | --- |
| sORFs.org | [14] | http://www.sorfs.org | sORF repository | Draws on RIBO-seq experimental data with conservation analyses, and rescans MS data from PRIDE for updated small-peptide validation |
| SmProt | [109] | http://bioinfo.ibp.ac.cn/SmProt/ | sORF repository | Database of small proteins, specifically from lncRNAs; obtains data from RIBO-seq, literature mining and MS, and integrates conservation analyses |
| OpenProt | [111, 112] | https://www.openprot.org/ | altORF resource | Contains information on protein isoforms and altORFs with experimental evidence; integrates RIBO-seq, MS, conservation analyses and functional domains |
| ARA-PEPs | [108] | http://www.biw.kuleuven.be/CSB/ARA-PEPs | sORF repository | Repository of putative sORF-encoded peptides specifically in Arabidopsis thaliana; data obtained from in-house tiling arrays and RNA-seq |
| PsORF | [107] | http://psorf.whu.edu.cn/ | sORF repository | Database of sORFs across different plant species, incorporating genomic, transcriptomic, RIBO-seq and MS data |
| MetamORF | [110] | http://metamorf.hb.univ-amu.fr/ | sORF repository | Repository of unique sORFs in the H. sapiens and M. musculus genomes, identified by experimental and computational methods |
| nORFs.org | [113] | https://norfs.org/ | novel ORF (nORF) repository | Provides aggregated information from databases such as sORFs.org, OpenProt and OpenCB |

  1. This table lists the publicly available databases for sORF identification. sORFs.org and OpenProt evaluate protein sequence identity based on BLASTp score, whereas SmProt provides a BLAST alignment search for manual evaluation of protein sequence identity. OpenProt annotates sORFs, but under the label of altORFs: ORFs that are longer than 30 codons and originate from ncRNAs, pseudogenes or transcripts with multiple ORFs. The limits set during search identification should therefore be noted. ARA-PEPs was developed specifically from A. thaliana sORF experimental data, and PsORF aims to store a more complete record of plant sORFs. A large share of the data in both MetamORF and nORFs.org was obtained from sORFs.org and OpenProt. nORFs.org additionally provides a protein sequence viewer, OpenCB variants and functions for customising annotation metrics
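
The 30-codon limit mentioned in the note can be illustrated with a minimal Python sketch. It is not OpenProt's implementation: the function names, the choice to exclude the stop codon from the count, and the strict "longer than" comparison are all assumptions made for illustration only.

```python
# Illustrative sketch only: checks whether a candidate ORF would clear a
# minimum-length cutoff such as the 30 codons quoted in the table note.
# Assumptions (not taken from OpenProt itself): the stop codon is not
# counted, and the cutoff is exclusive ("longer than").

MIN_ALTORF_CODONS = 30
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_count(orf_nt: str, include_stop: bool = False) -> int:
    """Count codons in an in-frame nucleotide ORF (DNA or RNA alphabet)."""
    seq = orf_nt.strip().upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    n = len(seq) // 3
    # Drop the terminal stop codon from the count unless asked to keep it.
    if not include_stop and seq[-3:] in STOP_CODONS:
        n -= 1
    return n


def passes_length_cutoff(orf_nt: str, min_codons: int = MIN_ALTORF_CODONS) -> bool:
    """Return True if the ORF is strictly longer than min_codons."""
    return codon_count(orf_nt) > min_codons


# Hypothetical sequences: a start codon, repeated alanine codons, a stop codon.
short_orf = "ATG" + "GCT" * 10 + "TAA"   # 11 coding codons
long_orf = "ATG" + "GCT" * 40 + "TAA"    # 41 coding codons

print(passes_length_cutoff(short_orf))
print(passes_length_cutoff(long_orf))
```

Under these assumptions, an sORF of 11 codons would fall below the cutoff and so would not be expected among altORF annotations, which is why the search limits of each database should be checked before concluding that an sORF is unannotated.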