r10 - 19 Jun 2006 - 16:42:45 - JonathanGemmell

Research Interests

Protein Folding

  • here
  • there
  • everywhere

Protein Design

  • The Inverse Protein Folding problem, also called Protein Design, spans the boundary between the computer and biological sciences. The problem consists of determining a sequence of amino acids that will, due to their combined biochemical properties, fold into a predetermined three-dimensional structure. This three-dimensional structure, or conformation, determines the effect of the protein on its environment. Hence, success in the Inverse Protein Folding problem would allow scientists to redesign existing proteins with additional or enhanced functionality, or even to design new proteins with novel functionality. While the impact of a solution to the Inverse Protein Folding problem is apparent, the computational challenges are extraordinary, since the search space grows exponentially with the length of the protein. We demonstrate a Monte-Carlo model for sampling the search space, coupled with energy functions for evaluating conformations. These energy functions are based on statistical propensities mined from the Protein Data Bank. We further demonstrate a framework based on the Illinois Bio-Grid Toolkit, which uses grid technologies to bring massive amounts of computational power to bear on the problem. paper
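As a rough illustration of the Monte-Carlo sampling described above, the following Python sketch runs a Metropolis search over amino acid sequences. It is not the IBG Toolkit code: the function names, the temperature parameter, and in particular toy_energy (which simply counts mismatches against an arbitrary target sequence) are assumptions made for this example, standing in for the statistical energy functions mined from the Protein Data Bank.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def toy_energy(sequence, target):
    """Hypothetical stand-in for a statistical energy function:
    counts the positions that differ from an arbitrary target."""
    return sum(1 for a, b in zip(sequence, target) if a != b)


def metropolis_design(energy, length, steps, temperature, rng):
    """Metropolis Monte Carlo over sequences of a fixed length.

    energy: callable taking a list of one-letter codes, returning a number.
    Returns the lowest-energy sequence visited and its energy.
    """
    current = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    current_e = energy(current)
    best, best_e = list(current), current_e
    for _ in range(steps):
        # Propose a single-point mutation at a random position.
        pos = rng.randrange(length)
        old = current[pos]
        current[pos] = rng.choice(AMINO_ACIDS)
        new_e = energy(current)
        delta = new_e - current_e
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_e = new_e
            if current_e < best_e:
                best, best_e = list(current), current_e
        else:
            current[pos] = old  # reject: undo the mutation
    return "".join(best), best_e
```

With 20 choices per position, a protein of length n has 20^n candidate sequences, which is why the search samples the space rather than enumerating it.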

CSC542 - Internship at The University of Chicago

Description

The student will spend fifteen hours per week in the laboratory of Dr. Sosnick, a professor and scientist at the University of Chicago’s Department of Biochemistry and Molecular Biology and the Institute for Biophysical Dynamics.

In particular, the student will aid the laboratory by writing algorithms that take advantage of statistical propensities found in biological databases and that may be used to tackle the Protein Design problem. Primarily, the algorithms will allow researchers to select amino acids based upon desired characteristics. These characteristics will include nearest neighbors, where the identity of the two neighboring amino acids along the main-chain is known; n-nearest neighbors, where the identity of the neighboring amino acids is known for a chosen distance along the main-chain; phi/psi selection, in which amino acids are chosen based upon their likelihood to conform to the desired structure; phi/psi selection for neighbors, in which amino acids are chosen based upon their ability to influence their neighbors' phi/psi angles; amino acid profiles detailing the amino acid's hydrophobicity, acidity, charge, size, etc.; and three-dimensional orientation propensities, in which the orientation of neighboring amino acids in three-dimensional space is calculated to determine the influence on the amino acid identity. Other traits may be incorporated. Ultimately, the algorithm will dynamically search a database containing millions of experimentally determined amino acid conformations and return a result based upon the desired characteristics.
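The nearest-neighbor criterion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the library is a plain list of one-letter sequences, the function names are hypothetical, and the propensity is taken to be the frequency of a residue between a given pair of neighbors divided by its overall frequency in the library. The project's actual definition may differ.

```python
from collections import Counter, defaultdict


def neighbor_counts(library):
    """Count how often each residue occurs between each pair of neighbors.

    library: iterable of sequences in one-letter code (an assumed format).
    Returns (context, background), where context[(left, right)] is a
    Counter of central residues and background counts every residue.
    """
    context = defaultdict(Counter)
    background = Counter()
    for seq in library:
        background.update(seq)
        for i in range(1, len(seq) - 1):
            context[(seq[i - 1], seq[i + 1])][seq[i]] += 1
    return context, background


def propensities(context, background, left, right):
    """Return {residue: P(residue | left, right) / P(residue)}.

    Values above 1 mean the residue is enriched between these neighbors.
    An unseen neighbor pair yields an empty dict.
    """
    seen = context.get((left, right))
    if not seen:
        return {}
    n_ctx = sum(seen.values())
    n_bg = sum(background.values())
    return {aa: (count / n_ctx) / (background[aa] / n_bg)
            for aa, count in seen.items()}
```

The same counting scheme could in principle be extended to the other criteria by widening the key, for example to binned phi/psi angles or to neighbors further along the main-chain.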

This algorithm will be incorporated into the IBG Toolkit and will be made available to the broader scientific community. Inclusion in the IBG Toolkit will aid in the design of proteins during Monte-Carlo simulations lasting hours or days. Furthermore, a web application will be written to allow researchers to search for amino acid propensities in a given 3D conformation. On one hand, the user will be able to investigate the propensities of amino acids to assume the characteristics described above. This mode will allow the user to investigate amino acids in isolation. On the other hand, users will also be able to upload a PDB (Protein Data Bank) file describing the characteristics of an entire string of amino acids composing a protein, and generate a file containing the amino acid propensities for each position.
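The upload mode starts from a PDB file, so a first step is recovering the residue at each position. The Python sketch below reads the fixed-column ATOM records of the PDB format and keeps one entry per alpha-carbon. The function name and the choice to filter on a single chain are assumptions for this example; alternate locations and insertion codes are ignored, and the propensity calculation itself is left out.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def residues_from_pdb(pdb_text, chain_id="A"):
    """Return [(residue_number, one_letter_code), ...] for one chain.

    Uses the fixed columns of ATOM records: atom name in columns 13-16,
    residue name in 18-20, chain ID in 22, residue number in 23-26.
    Unknown residue names are reported as "X".
    """
    residues = []
    for line in pdb_text.splitlines():
        if len(line) < 26 or line[:6] != "ATOM  ":
            continue
        # One alpha-carbon per residue marks the position on the main-chain.
        if line[12:16].strip() != "CA" or line[21] != chain_id:
            continue
        number = int(line[22:26])
        residues.append((number, THREE_TO_ONE.get(line[17:20], "X")))
    return residues
```

A per-position report could then be produced by looping over the returned list and writing one line of propensities for each residue number.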

Readings

In consultation with his advisor, Prof. Angulo, and his sponsor, Dr. Sosnick, the student will study relevant papers on biochemistry, computer science, and project management.

Deliverables

First, the student will provide weekly reports on progress, detailing the status of the project. Second, the student will provide a final report on the results of the project. Third, the student will provide the code for the project, including test cases and documentation.

Weekly Log

  • March 27th
    • Wrote CSC 542 description, reviewed course with Dave, and submitted proposal.
    • Began TWiki page.
    • 11:30 Group meeting, discussed basin hopping in OOPS and basic prediction.
    • 12:30 Exploring Protein Conformational Changes by Replica Exchange and Path - Peter Bolhuis, Dept. of Chemistry, University of Amsterdam
    • 1:30 Design of metal binding proteins.
    • 4:00 IBG meeting

  • March 31st
    • Packaged IBG Protein Designer
    • Included test cases for Vector, Atom, AminoAcid
    • Discussed "Folding inside of loops" with Tobin
    • Began installation of software on new server

  • April 3rd
    • Group Meeting
    • Debated the binning of the new (orientation-atom)-(orientation-atom) distance propensity algorithm.
    • Discussed cryoEM, for discovering the shape of folded RNA
    • Bio-Grid Meeting. Discussed poster submission for CTIRS

  • April 7th
    • PhD exam

  • April 14th
    • Prepared abstract for funding on an Interactive Programming Environment for Molecular Biology
    • Developed code to select amino acids for a position in a protein based upon statistical potentials

  • April 17th
    • Prepared slides for CTIRS conference
    • Continued developing IBG_ProteinDesign code
    • Design Group Meeting
      • Demo for a biological statistics package
      • Improving NMR Crystallography structure predictions by "folding in loops"
    • IBG Meeting
      • Presented a lecture on Inverse Protein Folding
      • Critiqued the presentation of other IBG members for the CTIRS

  • April 21st
    • Developed user interface for amino acid search

  • April 24th
    • Continued user interface for amino acid search
    • IBG Meeting
    • Design Group Meeting
      • Precipitating DNA in the presence of proteins

  • April 28th
    • Continued developing code to select amino acids for a position in a protein based upon statistical potentials.
    • Created a new class, TrimProtein, for faster searching.
    • Spoke with Tobin about using threading and the normalized substitution matrix for predicting protein secondary structure, in order to select initial basin assignments for OOPS.

  • May 1st
    • Design Group Meeting
    • IBG Meeting
    • Installed struts, tutorial

  • May 5th
    • Designed webpages: index, project, links, people, design, select amino acid, etc.
    • Edited struts-config.xml

  • May 8th
    • Design Group Meeting
    • IBG Meeting
    • Continue design of the website framework

  • May 12th
    • Implemented the upload functionality to the site. You can now upload PDB files to the website, and create an IBG_protein object on the server.

  • May 15th
    • Design Group Meeting
    • IBG Meeting
    • The library of 4000 proteins is formatted to be quickly read into an array of trim proteins.

  • May 18th
    • Initial search is done, just checking phi/psi/e of the position against the library. Graphics display almost done.

  • May 22nd
    • Design Group Meeting
    • IBG Meeting
    • Graphics display complete; an uploaded protein will show the propensities for each amino acid on the display

  • May 26th
    • Migraine

  • May 29th
    • Design Group Meeting
    • IBG Meeting
      • Team meeting on techniques to implement a molecular modeling programming environment

  • June 2nd
    • Implemented the search, included phi/psi/e of neighbors i+/-3. Still need to include identities.

  • June 5th
    • Design Group Meeting
    • IBG Meeting
      • Overview of statistical packages and molecular dynamics software
      • Presentation of techniques to detect noise in Mass Spectrometry
    • Worked on the search implementation, included nearest neighbor identities

  • June 9th
    • Added documentation for the site, project, people, links, etc.
    • Worked further on the search

  • June 12th
    • Design Group Meeting
    • Presentation of software to group
      • alternatives to burial level?
      • no proline options
      • i,i+1 interactions in the design, and other long-range interactions
      • color bar Dope-Cbeta with a current sequence on another line.
      • option to default to the input PDB's sequence
      • option to default to i,i+/-1 for all
      • additional line
      • In Select aa Window, a blank in the ID field represents the default neighbor.
      • turn off environment