Introduction

Peptide therapeutics has become a major field of biomedical and pharmaceutical research1. The underlying reason is that peptides, as therapeutic agents, offer greater safety, target specificity and potency than conventional chemical drugs1,2,3,4,5. Peptides have fewer side effects and do not accumulate in the body. Peptides are now used as drugs for diseases such as multiple sclerosis, prostate cancer, endometriosis and acromegaly1. Peptides with various therapeutic properties have been discovered6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 and their number is increasing with time21. Owing to their applicability, a number of bioinformatics platforms have been developed to assist peptide therapeutics22,23,24,25,26,27,28. According to a recent report, 128 peptides are in the clinical pipeline; of these, 74 are in Phase II and 14 in Phase III clinical trials29. Peptides such as INGAP for diabetes, N-acetyl-aspartyl-glutamate for geriatric depression, and GO-203–2C, p28 (CPP) and CDCA1 for cancer treatment are in clinical trials (https://clinicaltrials.gov). Given these potential applications, peptide therapeutics have been projected as a billion-dollar market4,5. For example, in 2011 alone, the United States approved 25 therapeutic peptides that had global sales of 14.7 billion US dollars29.

Despite these advantages, a few challenges keep therapeutic peptides from delivering their maximal benefit. These challenges include high production cost, low storage stability and suboptimal in vivo half-life5. Technological developments promise to improve production and storage stability4,5. The suboptimal in vivo half-life, however, remains a challenge, because a short half-life reduces the bioavailability that a peptide requires for its optimal function30. It is therefore imperative to design peptides with an optimal half-life to ensure their optimal action.

To address this important topic, a large number of experimental studies have been dedicated to improving and optimizing the half-life of peptides31,32,33,34. Although the data from these studies are very useful, they are scattered across the literature and are therefore difficult to access and use. To assist the scientific community, in this study we have developed a platform, ‘PEPlife’, that provides data related to the half-life of peptides from a single source. We have also incorporated various tools and modules in PEPlife to assist users in searching, comparing and analyzing the peptides, their half-lives and the related details. We hope that PEPlife will help the scientific community to design peptides with optimal stabilities.

System and Methods

Data Collection

The data were manually collected and curated from published research articles and patents. Only peptides whose half-life had been experimentally determined were included in the database. We queried PubMed for research articles and The Lens for patents. The query ‘(peptide[Title/Abstract] AND half-life[Title/Abstract])’ was used to retrieve articles relevant to the half-life of peptides from PubMed. It returned ~2280 articles as of November 2015. During the initial screening, reviews and articles lacking relevant information were excluded. Around 900 potential papers were scrutinized to mine the required fields. Finally, data were systematically curated from 335 articles. Similarly, the full texts of granted patents were obtained from The Lens and manually screened to select the patents with relevant information for data curation. We also collected relevant information about FDA-approved peptide drugs from DrugBank35 and related literature.
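As an illustration only, the PubMed query quoted above could be issued programmatically through the public NCBI E-utilities ESearch service. The article does not state how the search was run, so the endpoint, the date bounds and the function name `build_esearch_url` are assumptions of this sketch, which only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (an assumption: the article does not
# say which interface to PubMed was used).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string quoted in the text.
PEPLIFE_QUERY = "(peptide[Title/Abstract] AND half-life[Title/Abstract])"


def build_esearch_url(term: str, max_date: str = "2015/11/30", retmax: int = 100) -> str:
    """Return an ESearch URL for `term`, limited by publication date."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",       # filter on publication date
        "mindate": "1900/01/01",  # assumed lower bound
        "maxdate": max_date,      # assumed cut-off approximating "November 2015"
        "retmax": retmax,         # number of PMIDs returned per request
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_esearch_url(PEPLIFE_QUERY)
print(url)
```

The number of hits returned today would differ from the ~2280 reported, since PubMed is updated continuously.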

In PEPlife, we have systematically compiled comprehensive information about each peptide. The information includes the peptide’s name, sequence, length, terminal and non-terminal modifications, biological property, and the assay used to determine its half-life. To keep the information complete, we made multiple entries for the same peptide if its bioactivity or half-life was tested using different concentrations, conditions, routes of administration, etc. This complete information thus highlights the influence of these subtle conditions on the half-life of peptides.

Database Architecture and Web Interface

PEPlife was built using the Apache HTTP server on a Linux platform. MySQL, a relational database management system (RDBMS), was used to manage all the data in the backend. It allows easy storage and retrieval of the data in the database. HTML, CSS, PHP and JavaScript were used to develop the front-end web interface. The architecture of PEPlife is represented in Fig. 1.

Figure 1

Architecture of PEPlife database.

Database Content

The information in PEPlife can be categorized into two broad types: primary information and secondary information. The primary information has been curated from the literature and consists of the following major fields: (i) PMID, (ii) the peptide sequence, (iii) the name of the peptide, (iv) the length of the peptide, (v) N-terminal modification, (vi) C-terminal modification, (vii) configuration (linear or cyclic), (viii) chirality of the amino acids, (ix) chemical modification, (x) origin of the peptide, (xi) biological activity of the peptide, (xii) half-life, (xiii) assay type, (xiv) sample on which the half-life was tested and (xv) Patent ID.
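The primary fields listed above can be pictured as a simple record type. The following is a minimal sketch: the class name `PeplifeEntry` and all attribute names are hypothetical and do not reflect the actual PEPlife schema, and the length is derived from the sequence, which is only valid for plain one-letter sequences.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeplifeEntry:
    """One curated record; attribute names are illustrative, not the real schema."""
    pmid: Optional[str]
    sequence: str
    name: str
    n_terminal_modification: Optional[str]
    c_terminal_modification: Optional[str]
    configuration: str                 # "linear" or "cyclic"
    chirality: str                     # e.g. "L", "D" or "mixed"
    chemical_modification: Optional[str]
    origin: Optional[str]
    biological_activity: Optional[str]
    half_life: Optional[str]           # kept as text because units vary between studies
    assay_type: Optional[str]
    sample: Optional[str]
    patent_id: Optional[str] = None

    @property
    def length(self) -> int:
        # Derived field; assumes a one-letter sequence without inline modifications.
        return len(self.sequence)


# Placeholder record: only the name and sequence are taken from the article;
# every other value is left empty rather than guessed.
entry = PeplifeEntry(
    pmid=None, sequence="RRWQWR", name="Lcf1",
    n_terminal_modification=None, c_terminal_modification=None,
    configuration="linear", chirality="L", chemical_modification=None,
    origin=None, biological_activity=None, half_life=None,
    assay_type=None, sample=None,
)
print(entry.name, entry.length)
```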

The secondary information, which was derived from the primary information, includes the tertiary structures of peptides. To obtain the structural information, all peptides in the PEPlife database were searched against the peptide sequences in the Protein Data Bank (PDB)36. When an exact sequence match was found in PDB, the structure of the match was assigned to the query peptide. Using this approach, we determined the structures of 265 peptides. Where identical peptide sequences were not available, we predicted the structures using PEPstrMOD37, an updated and advanced version of PEPstr38 which supports peptides containing natural and modified residues38,39,40,41. Owing to the unavailability of force-field libraries for complex chemical modifications (e.g., pegylation, penicillamine, etc.), the structures of peptides containing such modifications were not predicted. The structures of 36 peptides with fewer than five residues could not be predicted using the software mentioned above. For these peptides, we used a linear conformation with both dihedral angles (ϕ and ψ) set to 180° as the initial structure, which was subjected to energy minimization and molecular dynamics simulation. The trajectory of the whole simulation was searched for the conformation with minimum energy, and this minimum-energy structure was taken as the final predicted structure. The structures of 132 peptides having more than 40 residues were predicted using the I-TASSER web service42. The tertiary structures of the peptide entries from DrugBank were obtained in the same way: of the 29 DrugBank entries, 6 were mapped to PDB, 9 were predicted using PEPstrMOD, and the remaining 14, which contain modified residues, were not predicted.
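The selection of a structure source described above can be summarized as a small decision function. This is a sketch under stated assumptions: the function name, the return labels and the order of the checks (PDB match first, then the complex-modification exclusion, then the length thresholds) are inferred from the text and are not the authors' actual pipeline.

```python
from typing import Optional, Set


def choose_structure_method(
    sequence: str,
    pdb_sequences: Set[str],
    has_complex_modification: bool = False,
) -> Optional[str]:
    """Return a label for the structure source, or None if no structure is produced."""
    # 1. An exact sequence match in PDB takes priority (assumed ordering).
    if sequence in pdb_sequences:
        return "PDB"
    # 2. Complex modifications (e.g. pegylation) lack force-field libraries.
    if has_complex_modification:
        return None
    n = len(sequence)
    # 3. Fewer than five residues: extended start (phi = psi = 180 degrees),
    #    then energy minimization and molecular dynamics.
    if n < 5:
        return "extended + minimization/MD"
    # 4. More than 40 residues: I-TASSER.
    if n > 40:
        return "I-TASSER"
    # 5. Everything else: PEPstrMOD.
    return "PEPstrMOD"


known = {"KKVVFKVKFK"}  # toy stand-in for the set of PDB sequences
print(choose_structure_method("KKVVFKVKFK", known))
print(choose_structure_method("RRWQWR", known))
print(choose_structure_method("AAA", known))
print(choose_structure_method("A" * 41, known))
```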

The secondary structures of all peptides were assigned from their tertiary structures using the DSSP software43. DSSP classifies each residue into one of eight states (B: beta-bridge; C: loop; E: extended strand; G: 3/10 helix; H: alpha-helix; I: pi-helix; S: bend and T: turn). The secondary structure analysis of the predicted structures (except DrugBank peptides) revealed that peptide residues most frequently belong to loop regions (~32%), followed by helices (~30%), bends (~19%) and turns (~17%). Only a few peptide residues were observed in strand regions (~2%). The predicted peptide structures were also converted into SMILES notation using the Open Babel software44.
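Percentages of this kind can be obtained by counting state letters over all residues. The sketch below assumes that each peptide's DSSP assignment is already available as a string of the eight state letters listed above, with loops written as C; the function name and the example string are hypothetical.

```python
from collections import Counter
from typing import Dict

# The eight states as named in the text.
DSSP_STATES = {
    "B": "beta-bridge", "C": "loop", "E": "extended strand", "G": "3/10 helix",
    "H": "alpha-helix", "I": "pi-helix", "S": "bend", "T": "turn",
}


def secondary_structure_fractions(dssp_string: str) -> Dict[str, float]:
    """Return the fraction of residues in each of the eight states."""
    if not dssp_string:
        raise ValueError("empty DSSP string")
    counts = Counter(dssp_string)
    total = len(dssp_string)
    return {state: counts.get(state, 0) / total for state in DSSP_STATES}


# Made-up 10-residue assignment, for illustration only.
fractions = secondary_structure_fractions("CCHHHHTTSC")
for state, value in fractions.items():
    if value > 0:
        print(f"{DSSP_STATES[state]}: {value:.0%}")
```

To reproduce database-wide percentages, the strings of all peptides would be concatenated before counting, so that longer peptides contribute proportionally more residues.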

Implementation of Web Tools

A number of tools have been integrated for data retrieval, similarity search and data analysis. A brief description of the options available in PEPlife follows.

Search Tools

We have incorporated four modules under the Search option to facilitate easy retrieval of data: Simple, Advanced, Peptide and SMILES. In the Simple Search, users can search peptides by any of the fields in the database and can also select the fields to display in the results. In the Advanced Search, users can perform complex, multiple queries to extract the desired entries from the database. This option allows the use of standard comparison operators (“=”, “>”, “<” and “LIKE”), and the outputs of different queries can be combined using the logical operators “AND” and “OR”. The Peptide Search tool searches for exact as well as substring matches of a given peptide sequence among the peptide sequences available in PEPlife. We have also maintained the structures of the peptides in SMILES format to help users understand the properties of peptides at the atom/bond level. The SMILES Search allows users to search for a query peptide in SMILES format among the SMILES strings stored in our database.
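A query of the kind supported by the Advanced Search can be sketched as a parameterized SQL statement. The example below uses an in-memory SQLite database as a stand-in for the MySQL backend; the table name, column names, identifiers and assay values are all hypothetical and exist only to show how the comparison operators combine with AND and OR.

```python
import sqlite3

# In-memory stand-in for the backend; schema and rows are toy data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE peptides (id TEXT, sequence TEXT, length INTEGER, assay TEXT)"
)
conn.executemany(
    "INSERT INTO peptides VALUES (?, ?, ?, ?)",
    [
        ("demo1", "RRWQWR", 6, "HPLC"),
        ("demo2", "KKVVFKVKFK", 10, "HPLC"),
        ("demo3", "AGRRWQ", 6, "ELISA"),
    ],
)


def advanced_search(conn, min_length, assay, motif):
    """Entries longer than min_length AND (measured by assay OR containing motif)."""
    sql = (
        "SELECT id FROM peptides "
        "WHERE length > ? AND (assay = ? OR sequence LIKE ?) "
        "ORDER BY id"
    )
    # Placeholders keep user input out of the SQL text.
    return [row[0] for row in conn.execute(sql, (min_length, assay, f"%{motif}%"))]


hits = advanced_search(conn, 5, "ELISA", "KVK")
print(hits)
```

Passing user input as bound parameters rather than concatenating it into the statement is the usual way to avoid SQL injection in a web-facing search form.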

Browsing Tools

In PEPlife, we have provided a simple yet thorough class-wise browsing facility, in which all the peptide entries have been categorized into different classes. In this module, the information related to a peptide can be browsed using the following seven criteria: (i) Half-life, (ii) Organism and Media (where the half-life was tested), (iii) Peptide Length, (iv) Publication Year, (v) Type of Modification, (vi) Type of Assay (used to detect the half-life) and (vii) DrugBank peptides.

Analysis Tools

PEPlife is equipped with a number of web-based tools for performing various sequence and structure analyses of query peptides. The BLAST, Smith-Waterman and GGSEARCH tools allow users to perform a similarity search of their query peptide against the PEPlife database. The Peptide Mapping tool allows users to perform sub-search and super-search against PEPlife peptides. The Sequence Alignment tool aligns the query peptide with only the user-selected peptides from PEPlife, which are specified by their PEPlife-IDs. The Structure Alignment tool aligns the structure of the query peptide with the structure of a chosen peptide from PEPlife. The PDB file of the query peptide and the PEPlife-ID of the subject peptide are submitted to perform the structural alignment.
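The two directions of the Peptide Mapping tool can be illustrated with plain substring tests. The article does not define the terms, so this sketch assumes that a sub-search returns database peptides that contain the query and a super-search returns database peptides that are contained in the query; the function names and the toy database are hypothetical.

```python
from typing import Dict, List


def sub_search(query: str, database: Dict[str, str]) -> List[str]:
    """IDs of database peptides that contain the query (assumed meaning of sub-search)."""
    return sorted(pid for pid, seq in database.items() if query in seq)


def super_search(query: str, database: Dict[str, str]) -> List[str]:
    """IDs of database peptides contained in the query (assumed meaning of super-search)."""
    return sorted(pid for pid, seq in database.items() if seq in query)


# Toy database keyed by made-up identifiers.
db = {"demo1": "RRWQWR", "demo2": "KKVVFKVKFK", "demo3": "VVFK"}

print(sub_search("VVFK", db))
print(super_search("GRRWQWRG", db))
```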

Data Statistics

PEPlife harbors a total of 2229 entries containing relevant information about 1193 unique peptides and their half-lives. Of these, 2066 entries were collected from 335 published articles, 134 entries from 16 patents and 29 entries from DrugBank. A significant portion of the entries (~833 entries) belongs to the period from 2005 to 2009. These peptides have been reported to show a diversity of biological activities, such as anticancer, antiviral, antimicrobial, antibacterial, neurotransmitter, erythropoiesis-stimulant, anticoagulant, antihyperglycemic, insulinotropic, antihypertensive, etc.

Peptides in PEPlife have different conformations, amino acid configurations and lengths (Fig. 2A). Examples of the effects of modifications on the half-life of peptides are given in Fig. 3 and Supplementary Table S1. Peptides in 245 entries are cyclic. Peptides in 213 entries have mixed (i.e., containing both L and D) amino acid configurations. Lengths of the peptides vary from less than five amino acids to more than 35 amino acids. Peptides with lengths from six to ten amino acids have the maximum number of entries (571 entries), followed by peptides having >35 amino acids (462 entries). Peptides composed of 21 to 25 amino acids are present in the least number of entries (57 entries) (Fig. 2B).

Figure 2

Distribution of peptides based on (a) conformation and configuration of amino acids, (b) length, (c) modifications and (d) assays used to measure half-life.

Figure 3

Examples to show the effects of modifications on the half-life of peptide analogues.

(a) Cyclization of peptides: the half-life of Lcf1 (RRWQWR) increases on head-to-tail cyclization of the same sequence in its analogue Lcf5. (b) Incorporation of D-amino acids: KSL7 (kKVVFKVKFk) with 2 D-amino acids has a longer half-life than KSL (KKVVFKVKFK). (c) Addition of terminal modifications: Lcf3 (CH3CO-RRWQWR) with an N-terminal modification and Lcf4 (CH3CO-RRWQWR-NH2) with both N- and C-terminal modifications have longer half-lives than Lcf1 (RRWQWR) with no terminal modifications. (d) Non-natural amino acid substitution: O-6 (VDKPPYLPRPRPPRRIYN-Orn) and O-9 (VDKPPYLPRPRPPRRIYN-Nmr) with non-natural amino acids have longer half-lives than O-5 (VDKPPYLPRPRPPRRIYNH).

Various modifications have been incorporated into peptides to increase their half-life, most of them at the termini (Fig. 2C). The most frequent N-terminal modification is the addition of 2,4-dichlorophenoxyacetic acid with (CH2)n spacers, followed by acylation. In addition, PEGylation, glycosylation, succinylation, addition of human serum albumin (HSA) and hydroxylation have been used as important N-terminal modifications to improve the half-life of peptides. Amidation is the most used C-terminal modification, followed by biotinylation with PEG (polyethylene glycol) spacers. Other C-terminal modifications include the addition of HSA (3 entries), cholesterol (2 entries), PEG (24 entries), XTEN (4 entries), Fc-region (14 entries), etc. Non-terminal modifications include methylation (27 entries), addition of fatty acid chains (7 entries), addition of carbohydrate chains (2 entries), reduced amide bonds (33 entries) and reduced carbamate bonds (16 entries), as well as the incorporation of non-natural amino acids such as biphenylalanine (Bip) (36 entries), pyroglutamic acid (pGlu) (27 entries), sarcosine (Sar) (45 entries), ornithine (Orn) (12 entries), norleucine (Nle) (4 entries), etc.

The entries in PEPlife show a number of in vivo (948 entries) and in vitro (1265 entries) methods used to assess the half-life of peptides. These methods include mass spectrometry, immunoassays, radiolabeling, spectroscopy and various other assays. Some of the favored assessment methods include HPLC (540 entries), radioimmunoassay (335 entries) and ELISA (91 entries) (Fig. 2D).
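For readers unfamiliar with how a half-life is obtained from such measurements, the following sketch estimates it from a time course of the fraction of intact peptide, for example HPLC peak areas normalized to time zero. It assumes simple first-order decay, which individual studies in PEPlife may or may not have used; the function name and the synthetic data are illustrative only.

```python
import math
from typing import Sequence


def half_life_first_order(times: Sequence[float], fractions: Sequence[float]) -> float:
    """Fit ln(fraction) = c - k * t by least squares and return ln(2) / k.

    Assumes first-order decay; the result is in the same unit as `times`.
    """
    if len(times) != len(fractions) or len(times) < 2:
        raise ValueError("need at least two paired observations")
    logs = [math.log(f) for f in fractions]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    k = -sxy / sxx  # decay rate constant is the negative slope
    if k <= 0:
        raise ValueError("data do not show decay")
    return math.log(2) / k


# Synthetic, noise-free data generated with k = 0.05 per minute.
times = [0.0, 10.0, 20.0, 30.0]
fractions = [math.exp(-0.05 * t) for t in times]
estimate = half_life_first_order(times, fractions)
print(f"estimated half-life: {estimate:.2f} min")
```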

Discussion

The half-life of a peptide determines its bioavailability to the organism; a peptide with therapeutic advantages must also possess optimal bioavailability to be used as a drug. A short half-life can lead to low bioavailability of a therapeutic peptide. Despite the relevance of half-life to bioavailability, no platform so far covers a broad variety of information related to the half-life of peptides. A few bioinformatics platforms predict the half-life of specific peptides, but only in specific environments31,45, and they do not contain a wide range of information. There is therefore a need for a database with a broad scope that can be used in peptide half-life improvement. In this report, we have created such a database in an attempt to fill this gap and to provide a repertoire of information related to the half-life of peptides having a variety of properties and modifications. The database also covers the variations observed in the half-life across different environments, organisms and routes of administration.

The half-life of a peptide depends on both the organism and the peptide. A number of in vitro and in vivo studies have been done to understand the relationship between the half-life of peptides and their sequences, structures and modifications, the host organisms, and the drug administration routes in the host organism. The factors that affect the enzymatic degradation and the pharmacokinetics of a peptide in an organism play crucial roles in deciding the stability of that peptide46,47. Factors that lower the enzymatic degradation and metabolism of a peptide tend to stabilize it48,49. Different organisms differ in their pharmacokinetics and in the extent of proteolysis of a peptide, leading to differences in its half-life46,47. Moreover, different individuals of the same species can show variable pharmacokinetics for the same peptide, leading to a variable half-life47.

The significant factors affecting half-life include the sequence of a peptide, modifications, administration routes, and the amount of the peptide (dose). Sequence variants of a peptide are observed to have different half-lives. Chemical modifications also alter the half-life49. To improve half-life, chemical modifications such as the use of D-amino acids, non-natural amino acids (e.g., ornithine), PEGylation and N- and C-terminal modifications have been extensively employed. A peptide administered to different organisms via the same route has different half-lives49. Furthermore, different administration routes also affect the half-life of a peptide50. Clearly, all these details are necessary to improve the half-life of therapeutic peptides. For this reason, it is essential to store such details on one platform for easy access and use.

PEPlife is a repository of valuable information related to the stability of peptides. It harbors extensive and systematic cataloging of the data related to the half-life of peptides and the affecting factors. This information can prove indispensable for the rational design of peptides of therapeutic importance. To add further advantages, a number of tools have been provided in the database to facilitate the extraction and analysis of the compiled information. We anticipate that PEPlife will be helpful not only to satisfy half-life queries but also to understand the properties of peptides that govern their half-lives.

In the future, various interesting studies can be done using the data in PEPlife, for example: (i) the structures available in PEPlife can be used for docking and membrane simulation studies, (ii) the PEPlife dataset can be used to develop prediction methods for peptide half-life, and (iii) the SMILES strings in PEPlife can be used to develop QSAR models. We hope that PEPlife will be a useful resource for researchers working on the design of therapeutic peptides.

Update of PEPlife

We will update PEPlife at regular intervals to further widen its coverage of peptide half-lives reported in the literature. PEPlife also gives users the option to submit new entries of peptides and their half-lives through its web interface by filling in an HTML form. Our team will confirm the validity of each new entry before incorporating it into PEPlife in order to maintain a high level of quality.

Limitations

We have attempted to cover as much information as possible related to the half-life of peptides by manual curation, but a few articles that could not be retrieved with our search criteria may not have been incorporated. We have provided structural information for most of the peptides; however, owing to the unavailability of force-field libraries for complex peptide modifications, the structures of a few peptides could not be predicted.

Additional Information

How to cite this article: Mathur, D. et al. PEPlife: A Repository of the Half-life of Peptides. Sci. Rep. 6, 36617; doi: 10.1038/srep36617 (2016).
