RESEARCH: DUD-E
FOLDING PROJECT #12217 PROFILE

PROJECT TEAM

Manager(s): Louis Smith
Institution: University of Pennsylvania

WORK UNIT INFO

Atoms: 43,599
Core: 0x22
Status: Public

TLDR; PROJECT SUMMARY AI BETA

This project uses computer simulations to study how proteins interact with drugs. It uses a dataset called DUD-E, which contains information about many different proteins and their interactions with small molecules. The goal is to create high-quality simulations that can be used to improve drug discovery methods.

Note: This TLDR is a simplification and may not be 100% accurate.

OFFICIAL PROJECT DESCRIPTION

In this series of projects we are simulating proteins that are part of the DUD-E benchmark data set for protein-ligand interactions, using simulations initialized from AlphaFold. Simulation methods to study protein-small molecule interactions are of critical importance to the early stages of drug discovery, but most methods have a poor balance of accuracy relative to cost.

Much of the development process for new compounds happens via screening large libraries of compounds for activity against target proteins believed to be relevant for a disease.

Focusing this search makes developing new molecules into drugs faster and more economical. For this kind of methods development, good reference data that is widely available is essential.

DUD-E is a classic dataset for benchmarking structural methods that attempt to predict protein-ligand interactions. It has been widely used because it covers diverse proteins, and each protein is paired with a fairly large collection (usually more than fifty) of small molecules whose ability to bind the receptor has been measured experimentally.

Using Folding@Home, we will create large, reference-quality simulations of these proteins.

Because we know how such simulations, and the binding methods we or others may test on them, should look and function, we have a great yardstick for improving the methods we have and developing new ones.

In this project series we have the following systems, many of which are known for their medical relevance in addition to having been extensively studied with both simulation and experiment in the past.

12201 - ACES: Acetylcholinesterase, which is critical to nervous system function in animals. It is the target of pesticides and also of numerous drugs. If targeted in the correct way, it can reduce neural swelling. This sequence happens to be from the Pacific electric ray, Torpedo californica, which played a landmark role in biomedical efforts to isolate neurotransmitter receptors and led to a mechanistic understanding of myasthenia gravis.

12202 - AKT2: A serine-threonine kinase taking part in the insulin signal transduction pathway. Implicated in some cancers, it has been a target of drug development campaigns in the past.

12203 - AMPC: A critical antibiotic resistance gene. It is a beta-lactamase capable of opening the critical structural feature of cephalosporin-type antibiotics, rendering them ineffective.

12204 - BACE1: Beta secretase 1, an aspartic acid protease that helps form myelin sheaths in neurons. It is the major generator of amyloid-beta peptides in neurons, and is therefore implicated in Alzheimer's disease.

12205 - BRAF: B-raf is involved in sending signals involved in cell growth, and as such is considered a proto-oncogene. It is a serine/threonine kinase that has several known inhibitors, some of which are now anti-cancer medications.

12206 - CASP3: A caspase-type protease that participates in the execution of apoptosis, the process of programmed cell death. It also acts to cleave one of the amyloid-forming proteins and is therefore implicated in Alzheimer's dementia.

12207 - CDK2: One of the cyclin-dependent kinases, this protein is a checkpoint kinase that signals transitions between growth and DNA synthesis phases in the cell cycle. Dysfunction in this checkpoint is associated with cancer; inhibiting CDK2 can arrest the cell cycle in cases of abnormal growth, so it has been an anti-cancer target for some time.

12208 - CSF1R: Colony stimulating factor 1 receptor, which, when bound by cognate ligands, promotes survival, proliferation, and differentiation of many myeloid cell types. It is thus involved in disease and is targeted in therapies for cancer, neurodegenerative diseases, and inflammatory bone diseases.

12209 - DPP4: Dipeptidyl peptidase-4, a protein that cuts up certain other proteins on the surfaces of most cells. It is important in immune regulation, signal transduction, and apoptosis. Molecules inhibiting its enzymatic activity can help treat type 2 diabetes because the peptide hormones GLP-1 and GIP are degraded by DPP4. Thus, inhibiting DPP4 prolongs the effects of these hormones.

12210 - EGFR: Epidermal growth factor receptor. Its deficient signaling is associated with Alzheimer's dementia, whereas its over-expression is a common characteristic of tumor cells. It is thus an oncogene that is targeted by numerous anti-cancer molecules and drugs. Many of these are targeted at the tyrosine kinase domain, because hampering its function prevents excessive transduction of the signals these receptors would otherwise send to the nucleus of the tumor cell.

12211 - ESR1: Estrogen Receptor Alpha is critical to many tissue differentiation processes across the body, and has been targeted by various drugs to both enhance and suppress its effects depending on associated conditions.

12212 - FA10: Coagulation factor X is an enzyme in the coagulation signaling cascade for forming blood clots. It is a serine endopeptidase, and has been targeted by inhibitors to reduce coagulation in medical contexts where that is desirable.

RELATED TERMS GLOSSARY AI BETA

Note: Glossary items are a high level summary and may not be 100% accurate.

proteins

Large biomolecules essential for various biological functions.

Scientific: Biopharmaceuticals
Biotechnology / Structural Biology

Proteins are complex molecules that perform a wide range of functions in living organisms, including building and repairing tissues, transporting molecules, and catalyzing chemical reactions.


DUD-E

Directory of Useful Decoys - Enhanced

Acronym: Biotechnology
Pharmacology / Drug Discovery

DUD-E is a benchmark dataset used to evaluate the performance of computational methods for predicting protein-ligand interactions. It contains diverse proteins and their known binding affinities with various small molecules.


drug discovery

The process of identifying and developing new medications.

Process: Biopharmaceuticals
Pharmacology / Research & Development

Drug discovery is a complex multi-stage process that involves identifying promising drug candidates, testing their effectiveness and safety, and ultimately bringing them to market.


simulations

Computer models that mimic real-world processes.

Method: Biopharmaceuticals
Biotechnology / Structural Biology

Simulations are used in biotechnology to study complex biological systems, such as protein folding and drug interactions. They allow researchers to explore different scenarios and predict outcomes without conducting expensive and time-consuming experiments.


AlphaFold

A deep learning algorithm for predicting protein structures.

Software: Biopharmaceuticals
Biotechnology / Artificial Intelligence

AlphaFold is a groundbreaking AI system that can accurately predict the 3D structure of proteins from their amino acid sequences. This has revolutionized structural biology and drug discovery.


Folding@Home

A distributed computing project that uses idle computer processing power to simulate protein folding.

Software: Biopharmaceuticals
Biotechnology / Distributed Computing

Folding@Home harnesses the combined computational power of millions of computers worldwide to perform complex simulations of protein folding. This helps researchers understand how proteins fold and function, which is crucial for drug discovery and other biological research.


Acetylcholinesterase

An enzyme that breaks down acetylcholine in the nervous system.

Enzyme: Pharmaceuticals
Biochemistry / Neuroscience

Acetylcholinesterase is a key enzyme involved in nerve signal transmission. It breaks down the neurotransmitter acetylcholine, allowing for the termination of signals between nerve cells.

PROJECT FOLDING PPD AVERAGES BY GPU

Data as of Tuesday, 14 April 2026 06:35:28
Rank | Model Name | Folding@Home Identifier | Make | GPU Model | PPD Average | Points per WU Average | WUs per Day Average | WU Time Average
1 | GeForce RTX 4070 Ti | AD104 [GeForce RTX 4070 Ti] | Nvidia | AD104 | 6,150,528 | 103,728 | 59.29 | 0 hrs 24 mins
2 | GeForce RTX 3090 | GA102 [GeForce RTX 3090] | Nvidia | GA102 | 5,295,090 | 101,028 | 52.41 | 0 hrs 27 mins
3 | Radeon RX 7900XT/XTX | Navi 31 [Radeon RX 7900XT/XTX] | AMD | Navi 31 | 3,003,669 | 83,904 | 35.80 | 0 hrs 40 mins
4 | GeForce GTX 1080 Ti | GP102 [GeForce GTX 1080 Ti] 11380 | Nvidia | GP102 | 1,872,474 | 72,119 | 25.96 | 0 hrs 55 mins
5 | GeForce RTX 2060 Super | TU106 [GeForce RTX 2060 SUPER] | Nvidia | TU106 | 1,754,560 | 70,487 | 24.89 | 0 hrs 58 mins
6 | GeForce RTX 3060 Mobile / Max-Q | GA106M [GeForce RTX 3060 Mobile / Max-Q] | Nvidia | GA106M | 1,705,443 | 69,559 | 24.52 | 0 hrs 59 mins
7 | GeForce GTX 1660 SUPER | TU116 [GeForce GTX 1660 SUPER] | Nvidia | TU116 | 1,134,776 | 60,339 | 18.81 | 1 hrs 17 mins
8 | GeForce GTX 1070 | GP104 [GeForce GTX 1070] 6463 | Nvidia | GP104 | 1,115,774 | 60,461 | 18.45 | 1 hrs 18 mins
9 | P104-100 | GP104 [P104-100] | Nvidia | GP104 | 958,341 | 57,487 | 16.67 | 1 hrs 26 mins
10 | GeForce GTX 1650 SUPER | TU116 [GeForce GTX 1650 SUPER] | Nvidia | TU116 | 747,800 | 53,105 | 14.08 | 1 hrs 42 mins
11 | GeForce GTX 1660 Ti | TU116 [GeForce GTX 1660 Ti] | Nvidia | TU116 | 642,344 | 50,688 | 12.67 | 1 hrs 54 mins
12 | GeForce GTX 970 | GM204 [GeForce GTX 970] 3494 | Nvidia | GM204 | 522,478 | 45,792 | 11.41 | 2 hrs 6 mins
13 | Quadro T1000 Mobile | TU117GLM [Quadro T1000 Mobile] | Nvidia | TU117GLM | 378,876 | 41,971 | 9.03 | 2 hrs 40 mins
14 | GeForce GTX 1050 Ti | GP107 [GeForce GTX 1050 Ti] 2138 | Nvidia | GP107 | 377,342 | 42,229 | 8.94 | 2 hrs 41 mins
15 | GeForce GTX 960 | GM206 [GeForce GTX 960] 2308 | Nvidia | GM206 | 305,221 | 39,309 | 7.76 | 3 hrs 5 mins
16 | GeForce GTX 1070 Ti | GP104 [GeForce GTX 1070 Ti] 8186 | Nvidia | GP104 | 231,640 | 11,987 | 19.32 | 1 hrs 15 mins
17 | GeForce GTX 950 | GM206 [GeForce GTX 950] 1572 | Nvidia | GM206 | 224,145 | 35,490 | 6.32 | 3 hrs 48 mins
18 | GeForce GTX 980 Ti | GM200 [GeForce GTX 980 Ti] 5632 | Nvidia | GM200 | 202,133 | 33,056 | 6.11 | 3 hrs 55 mins
19 | GeForce GTX 1050 Mobile | GP107M [GeForce GTX 1050 Mobile] | Nvidia | GP107M | 155,792 | 32,073 | 4.86 | 4 hrs 56 mins
20 | GeForce GT 1030 | GP108 [GeForce GT 1030] | Nvidia | GP108 | 121,152 | 28,926 | 4.19 | 5 hrs 44 mins
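
The columns in the GPU table above appear to be related: the average PPD (points per day) looks close to the average points per WU multiplied by the average WUs per day. The short Python sketch below checks that assumption on three rows copied from the table. The relationship is an assumption, not a documented formula, and the function names `estimated_ppd` and `relative_gap` are hypothetical helpers introduced only for illustration.

```python
# Rows copied from the GPU table above:
# (model name, reported PPD average, points per WU average, WUs per day average)
ROWS = [
    ("GeForce RTX 4070 Ti", 6_150_528, 103_728, 59.29),
    ("GeForce RTX 3090", 5_295_090, 101_028, 52.41),
    ("GeForce GT 1030", 121_152, 28_926, 4.19),
]


def estimated_ppd(points_per_wu: float, wus_per_day: float) -> float:
    """Assumed relationship: points per day = points per WU * WUs per day."""
    return points_per_wu * wus_per_day


def relative_gap(reported: float, estimated: float) -> float:
    """Fractional difference between the reported and the estimated value."""
    return abs(reported - estimated) / reported


if __name__ == "__main__":
    for model, reported, points_per_wu, wus_per_day in ROWS:
        estimate = estimated_ppd(points_per_wu, wus_per_day)
        gap = relative_gap(reported, estimate)
        print(f"{model}: reported {reported:,}, estimated {estimate:,.0f}, gap {gap:.3%}")
```

Small gaps are expected because the published averages are rounded (for example, WUs per day is given to two decimal places), so the estimate should be read as a rough consistency check rather than an exact reconstruction.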

PROJECT FOLDING PPD AVERAGES BY CPU BETA

Data as of Tuesday, 14 April 2026 06:35:28
Rank
Project
CPU Model Logical
Processors (LP)
PPD-PLP
AVG PPD per 1 LP
ALL LP-PPD
(Estimated)
Make