Virtual Screening Performance
Lead Finder's performance in virtual screening was assessed on a set of 34 protein targets. These structures were chosen on the basis of:
  • protein relevance for drug discovery research;
For each protein target, a set of active ligands was extracted from the following public sources: the PDB database [1], KiBank [2], and the active ligands published by the Surflex developers [3]. The same set of 1904 decoy ligands was used in all of the current virtual screening studies; it was taken from the original Surflex publication [4] and is available for download from the Surflex developers' site [3].
Note that the VS-score is different from the ranking score (used for pose ranking during ligand docking) and the dG-score (used to estimate the binding energy of docked ligand poses). The VS-score is specially parameterized to rank-order compounds during virtual screening studies (for details of the VS-score and the other scoring functions used in Lead Finder, see the Scoring Functions section).
Benchmarking experiments consisted of creating a test library of compounds (obtained by mixing active ligands with the set of decoy compounds), docking each compound from the library to each particular target, and rank-ordering the compounds according to the calculated VS-score. Virtual screening efficiency was quantified with two well-recognized parameters: the area under the receiver operating characteristic (ROC) curve and the enrichment factor (EF). Results of the virtual screening experiments are presented in the table below.
The area under the ROC plot (or simply ROC) is an integral parameter of virtual screening performance. The curve is built according to the following rule: for a given fraction of the screened library, the Y-coordinate denotes the fraction of active compounds found, and the X-coordinate denotes the fraction of inactive compounds (decoys) found. An ideal curve corresponds to 100% of true actives found before any decoy; this ideal curve gives ROC = 1.
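The ROC construction rule can be illustrated with a minimal sketch. It assumes the library has already been sorted from best to worst score and that each compound carries only an active/decoy label. The function name `roc_auc` and the input format are hypothetical and are not part of Lead Finder, and tied scores are not treated specially.

```python
def roc_auc(ranked_is_active):
    """Area under the ROC curve for a library ordered from best to worst score.

    Each element is True for an active ligand and False for a decoy.
    (Hypothetical helper for illustration; not a Lead Finder function.)
    """
    n_active = sum(ranked_is_active)
    n_decoy = len(ranked_is_active) - n_active
    if n_active == 0 or n_decoy == 0:
        raise ValueError("need at least one active and one decoy")

    actives_seen = 0
    area = 0.0
    for is_active in ranked_is_active:
        if is_active:
            # An active moves the curve up by 1/n_active (Y-coordinate).
            actives_seen += 1
        else:
            # A decoy moves the curve right by 1/n_decoy (X-coordinate);
            # add the rectangle under the curve at its current height.
            area += (actives_seen / n_active) / n_decoy
    return area


# All actives ranked ahead of all decoys: the ideal curve.
print(roc_auc([True, True, False, False]))
# One decoy ranked ahead of one active.
print(roc_auc([True, False, True, False]))
```

With this definition, a ranking that places every active ahead of every decoy gives an area of 1, and a ranking that places every decoy first gives 0.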

| Protein target | PDB id | ROC | EF40 | EF70 | Number of ligands | dG (kcal/mol) |
|---|---|---|---|---|---|---|
| Beta-secretase | 1m4h | 0.98 | 16.9 | 16.3 | 40 | -10.8 |
| HIV-1 protease | 1pro | 0.98 | 13.2 | 13.4 | 50 | -11.5 |
| Factor Xa | 1fjs | 0.98 | 12.8 | 11.4 | 50 | -9.7 |
| Estrogen receptor antagonists | 3ert | 0.97 | 23.8 | 15.2 | 30 | -11.6 |
| Ribonuclease A | 1qhc | 0.95 | 12.9 | 8.9 | 30 | -9.0 |
| Epidermal growth factor receptor kinase | 1m17 | 0.95 | 7.3 | 8.1 | 50 | -9.4 |
| cAMP-dependent protein kinase | 1fmo | 0.94 | 6.2 | 6.6 | 50 | -10.3 |
| Urokinase-type plasminogen activator | 1gj7 | 0.94 | 6.9 | 7.3 | 20 | -9.2 |
| p38 MAP kinase | 1kv2 | 0.92 | 4.2 | 5.4 | 50 | -10.8 |
| Acetylcholinesterase | 1e66/1eve | 0.91 | 4.3 | 5.1 | 30 | -8.2 |
| HSP90 | 1uy6 | 0.89 | 3.9 | 4.5 | 30 | -8.6 |
| Lck kinase | 1qpe | 0.87 | 4.6 | 3.8 | 40 | -8.3 |
| Estrogen receptor agonists | 1l2i | 0.86 | 2.3 | 2.7 | 30 | -9.3 |
| Vascular endothelial growth factor receptor kinase 2 | 2oh4 | 0.86 | 4.0 | 3.7 | 50 | -8.9 |
| Thermolysin | 4tmn | 0.86 | 16.6 | 3.7 | 20 | -9.5 |
| Neuraminidase | 2qwg | 0.84 | 3.7 | 2.6 | 30 | -7.3 |
| Thymidylate synthase | 1f4g | 0.77 | 3.2 | 2.3 | 15 | -8.7 |
| Progesterone receptor | 1sr7 | 0.76 | 2.1 | 2.0 | 20 | -10.4 |
| Oligopeptide-binding protein | 1b5j | 1.00 | 89.4 | 78.3 | 16 | -15.0 |
| Orotidine-5’-P decarboxylase | 1eix | 0.99 | 64.0 | 26.1 | 18 | -11.0 |
| Protein tyrosine phosphatase 1B | 1c84 | 0.99 | 55.4 | 11.7 | 20 | -9.5 |
| Peroxisome proliferator activated receptor gamma | 1fm9 | 0.98 | 11.8 | 11.7 | 50 | -11.9 |
| Ribonuclease T1 | 1rnt | 0.97 | 74.1 | 35.4 | 10 | -8.5 |
| Thrombin | 1c4v | 0.96 | 8.6 | 10.8 | 40 | -10.2 |
| Trypsin | 1qbo | 0.95 | 9.4 | 9.5 | 20 | -10.6 |
| Thymidine kinase | 1kim | 0.94 | 27.8 | 20.5 | 10 | -8.9 |
| Mineralocorticoid receptor | 2aa2 | 0.94 | 5.8 | 10.4 | 10 | -11.2 |
| Poly(ADP-ribose) polymerase | 1efy | 0.92 | 5.7 | 7.6 | 10 | -7.5 |
| Penicillopepsin | 1bxo | 0.91 | 8.2 | 5.5 | 6 | -10.3 |
| Cyclooxygenase-2 | 1cx2 | 0.91 | 4.0 | 5.1 | 50 | -11.3 |
| Fibroblast growth factor receptor kinase | 1fgi | 0.86 | 3.0 | 3.6 | 50 | -9.6 |
| Angiotensin-converting enzyme | 1o86 | 0.83 | 2.7 | 3.8 | 20 | -9.4 |
| Glucocorticoid receptor | 3bqd | 0.82 | 2.5 | 2.7 | 50 | -10.3 |



The enrichment factor is calculated at a given percentage of active compounds found, and is defined as the fraction of active compounds found divided by the fraction of the library that had to be screened to find them. For example, EF70 denotes the enrichment factor at 70% of active ligands found, and EF100 denotes the enrichment factor at 100% of actives found.
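The definition above can be sketched in a few lines. This is only an illustration under stated assumptions: the library is already sorted from best to worst score, the function name `enrichment_factor` is hypothetical (not part of Lead Finder), and the number of actives to recover is rounded up when the requested percentage does not correspond to a whole number of ligands. The rounding rule is an assumption; the source does not state one.

```python
import math


def enrichment_factor(ranked_is_active, percent_actives):
    """Enrichment factor at a given percentage of actives found.

    ranked_is_active: library ordered from best to worst score,
        True for an active ligand and False for a decoy.
    percent_actives: e.g. 70 for EF70.
    (Hypothetical helper for illustration; not a Lead Finder function.)
    """
    n_total = len(ranked_is_active)
    n_active = sum(ranked_is_active)
    if n_active == 0:
        raise ValueError("library contains no active ligands")
    if not 0 < percent_actives <= 100:
        raise ValueError("percent_actives must be in (0, 100]")

    # Assumption: round the required number of actives up to a whole ligand.
    target = max(1, math.ceil(n_active * percent_actives / 100))

    found = 0
    for n_screened, is_active in enumerate(ranked_is_active, start=1):
        if is_active:
            found += 1
        if found >= target:
            fraction_of_actives = found / n_active
            fraction_of_library = n_screened / n_total
            return fraction_of_actives / fraction_of_library


# Toy library: 10 compounds, 2 actives ranked 1st and 5th.
ranking = [True, False, False, False, True,
           False, False, False, False, False]
print(enrichment_factor(ranking, 50))   # half the actives, 1/10 of the library
print(enrichment_factor(ranking, 100))  # all actives, 5/10 of the library
```

Under this definition the upper bound on the enrichment factor is the library size divided by the number of actives, which is reached when every required active is ranked ahead of every decoy. Targets with few active ligands therefore have a higher achievable maximum.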
Two protein models (PDB structures 1e66 and 1eve) were used for acetylcholinesterase because some active ligands overlap with phenylalanine 330, which can adopt two distinct conformations.
Lead Finder achieved significant enrichments for almost all protein targets. The most outstanding enrichments were obtained for oligopeptide-binding protein (OppA), orotidine-5'-phosphate decarboxylase (OPD), protein tyrosine phosphatase 1B (PTP1B), beta-secretase, HIV-1 protease, factor Xa, and peroxisome proliferator activated receptor gamma (PPAR-γ). Analysis of the docked ligand poses revealed that for OppA and OPD Lead Finder correctly found multiple hydrogen bonds and complementary electrostatic interactions of charged groups, which yielded high enrichments for those targets. For PPAR-γ, HIV-1 protease and beta-secretase, the good results stem from correct detection of crucial hydrogen bonds and hydrophobic contacts between the active ligands and the corresponding proteins. Interestingly, PPAR-γ has traditionally been recognized as a difficult target [5,6,7], and two protein models were even used for virtual screening studies of PPAR-γ [5]; Lead Finder, however, achieved excellent results using a single protein structure model. Successful docking of the positively charged fragment of factor Xa inhibitors gave rise to high enrichment for that target.
Nuclear receptors demonstrated enrichments ranging from moderate (for glucocorticoid receptor (GR) and progesterone receptor (PR)) to high (for estrogen receptor and mineralocorticoid receptor). Close analysis of GR and PR reveals that their ligand-binding cavities are spacious and can easily accommodate ligands of diverse shape and size, giving rise to false positives.
Thymidylate synthase (TS) also showed quite modest enrichment. In this case we attribute the results to insufficiently specific binding of its native ligands (the predicted average binding energy over active ligands was only -8.7 kcal/mol), which is probably caused by the relatively shallow and surface-exposed binding site of the enzyme. A similar situation probably occurs for neuraminidase, whose binding site forms an extended funnel: ligands have too much freedom to slide along the protein surface, which dilutes the specificity of binding. As in the case of TS, native ligands of neuraminidase show quite modest binding energy (-7.3 kcal/mol on average).
Finally, protein kinases deserve special attention, as they are intensively studied as drug targets [8]. To date, 7 protein kinases have been assessed by Lead Finder in virtual screening experiments. Overall, kinases showed good enrichment, from ROC = 0.86 for fibroblast growth factor receptor kinase (FGFR) and vascular endothelial growth factor receptor kinase 2 (VEGFR) to ROC = 0.95 for epidermal growth factor receptor kinase (EGFR) and ROC = 0.96 for tyrosine kinase C-SRC (SRC). Close examination of the predicted poses for active kinase ligands revealed a common feature that determines the degree of virtual screening success: targets for which the active ligands succeeded in forming crucial correlated hydrogen bonds with the hinge fragment of the kinase demonstrated higher enrichment than targets for which these hydrogen bonds were observed less frequently. We also found that the relative mobility of the N- and C-terminal lobes of the catalytic kinase domain influences the accessibility of the correct ligand-binding pose. For this reason, evaluating multiple protein conformers is suggested to achieve high enrichment for kinases.

The benchmarking data obtained so far are available for download, including a description of the virtual screening experiments, the structures of the protein targets, the active and decoy ligands, and the output results provided by Lead Finder.
 
E-mail: info@biomoltech.com |  Phone: +1(416)238-1263 |  Fax: +1(416)352-6117
Mailing address: 226 York Mills Rd, Toronto, Ontario M2L 1L1, Canada
© 2016 BioMolTech Corp.