Structure Gazing with Chimera


Today we will use Chimera to analyze X-ray crystal structures of the Rho transcription terminator. The Chimera interface can be a bit overwhelming at first, so we will spend the next hour of lecture on some short exercises designed to help us get used to it.

I suggest the following resources if you are interested in learning about X-ray Crystallography:
  • Evaluating X-Ray Crystal Structure Papers by James Berger
  • Awesome Movies by James Holton
  • Crystallography Made Crystal Clear
  • UC Berkeley course offered by Paul Adams on Macromolecular Crystallography

If your course installation worked properly, you should be able to open Chimera by simply typing "chimera" in the terminal.
Introduction to Chimera documentation

Overview of pulldown menus
  • File: Opens, saves, and closes files
  • Select: Various methods for selecting things
  • Actions: Modulates display options
  • Presets: Makes nice displays for publication
  • Tools: Modules for doing cool stuff
  • Favorites: Your favorite stuff
  • Help: Various ways to get help

Basic Controls: Loading Structures, Command Line, Select, Color and Focus

First, let's load up a structure. We'll start by looking at the open ring Rho structure (File -> Fetch by ID: 1PV4). We could also save the file locally and open it.
You can use the command line (Tools->General Controls->Command Line) to make your life easier. The quick reference guide of Chimera commands can be found here:

Quick Reference Guide
Structures are organized into (in order of decreasing size): models, chains, residues, and atoms. Each PDB file that is read into Chimera is assigned a model number, and these numbers are visible as "Active models" near the toolbar.
Arguably the most useful command is select, which can be abbreviated as sel. To select residues 60-120 from chains A and B of model 0, I would type:

sel #0 :60-120.A-B
Again, the format for these commands is:

sel #MODELNUMBER :RESIDUERANGE.CHAINS

(Note: you do not need all of these values, because any part left blank is filled in with all available objects. For example, sel #0 :60-120. will select that residue range in model 0 for ALL available chains.)
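If you end up generating many selections (for example, from a script), the specification format above can be assembled programmatically. The following is only a minimal sketch: atom_spec is a hypothetical helper name, not part of Chimera, and it simply builds the text you would type after sel.

```python
def atom_spec(model=None, residues=None, chains=None):
    """Build a Chimera-style specification string: #MODEL :RESIDUES.CHAINS

    Any argument left as None is omitted, which Chimera treats as
    "all available objects" at that level.
    """
    parts = []
    if model is not None:
        parts.append(f"#{model}")
    if residues is not None or chains is not None:
        spec = ":" + (residues or "")
        if chains is not None:
            spec += "." + chains
        parts.append(spec)
    return " ".join(parts)


# Residues 60-120 of chains A and B in model 0
print("sel " + atom_spec(model=0, residues="60-120", chains="A-B"))
# Same residue range in every chain of model 0
print("sel " + atom_spec(model=0, residues="60-120"))
```

The printed lines can be pasted directly into the Chimera command line.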
If we wanted to change the color of these selected residues, we could simply navigate to the Color tool (Actions->Color) and pick the color that we'd like to use. Alternatively, we could change the background color (Actions->Color->All Options->Background, Select Color) - go ahead and set the background to white.
Another useful command is the "focus" command. With no arguments, it will zoom out and show you all available structures. The combination of "focus sel" is especially powerful, as it will automatically reposition the viewing window to view whatever you currently have selected.
Mini-exercise 1: Rho is a two-domain homohexamer. Each monomer is composed of an N-terminal OB (oligonucleotide binding) fold and a C-terminal RecA domain. The border between these two domains is approximately residue 129. Save an image similar to the one shown below, in which the N-terminal domains are colored red and the C-terminal domains are colored blue:

miniex1.jpg
Displaying Sequences

Another very useful control can be found in Tools->Sequence->Sequence, which displays the sequence for chains of our choosing. The selection is synchronized with your current selection in Chimera, which is very useful. What portion of the residues shown in the sequence actually show up in the model? What is going on with the rest of the residues, and what do you think a red box around a residue means?
Saving Sessions

It is often the case that we'd like to save the current state of a Chimera session rather than having to recreate it at a later date; this can be done very easily (File->Save Session As...).
Saving Structures

Often you will want to save a structure that you have edited as one or more PDB files (File->Save as PDB). Note that you will need to include either $number or $letter in the file name if you plan to save multiple models as multiple PDB files (if $number is used, the files will be distinguished by different numbers at that position in the file name).
The Delete Command

The delete command (abbreviation "del" works just the same) is a necessary evil in Chimera. It's necessary because it allows you to generate objects that are subsets of any structure, and it's evil because there is no "back" button in Chimera. I strongly recommend two rules whenever you use the "delete" command:

1) Save a session!

2) Select what you are planning to delete, and type focus sel to make sure it's only what you want to delete.

At this point, and only at this point, it is potentially safe to type delete sel.
Mini-exercise 2: Generate two separate models of a Rho monomer (just use Chain A), one that contains the N-terminal domain of Rho and another that contains the C-terminal domain.
Structural Alignments

Mini-exercise 3: Align Rho against a structurally related protein, the F1 ATPase. Open both the monomeric N-terminal and C-terminal domains from Mini-exercise 2, and then open a structure of the F1 ATPase (PDB code 1W0K).

Use the MatchMaker tool (Tools->Structure Comparison->MatchMaker) and figure out whether the N-terminal or C-terminal domain aligns better with the F1 ATPase.

Recapitulate the figure below:


miniex3.jpg
Handling Multiple Models

Save a session now, since we're about to make an active attempt to mess up our previous work. Alignments are by no means permanent in Chimera, which is clear when you attempt to move only one model at a time. De-select one of the two active models and move a structure; your alignment is immediately lost.
Changing Depictions, Using the Distance Tool and Displaying Hydrogen Bonds
A crystal structure of the closed conformation of Rho is also available (3ICE), so go ahead and start a new session and then load that structure. Chimera offers several representations (Presets) that are useful for different purposes. Select the "all atoms" representation.

Mini-exercise 4: Calculate the distance between the N- and C-terminus of the A subunit of the 3ICE structure using the Distances tool (Tools->Structure Analysis->Distances). (Hint: )

We can also have Chimera calculate and display hydrogen bonds. Select the RNA (Chain G), and then use the FindHBond tool (Tools->Structure Analysis->FindHBond) with the "Only find H-bonds with one end selected" option checked.

Comparing Different Conformations of the Same Protein
We will compare to the open ring structure, but we need the chains to be numbered in the same order (looking down onto the N-terminus, the chains are labeled A-F in the clockwise orientation in the closed ring structure, and A-F in the counterclockwise orientation in the open ring structure). I have loaded a version of the open ring (1PV4) structure with the chains re-numbered to fix this issue for all of you: Open Ring Rho Structure
First, go ahead and align the open and closed ring Rho structures using MatchMaker, and force Chain A of the open ring structure to align with Chain A of the closed ring structure. What differences can you see between these two structures besides the fact that the ring is open or closed?
Next, let's try out the Morph Conformations utility in Chimera (Tools->Structure Comparison->Morph Conformations). I generally either move or delete the models on which the morph is based to make it easier to visualize. While these morphs are not necessarily biologically relevant (ideally we would want crystal structures of a variety of intermediates between these two structures), they can give us a rough guess at the sort of conformational changes that need to occur to get from one state to another.
The Yale Morph Server can handle much more complicated inputs (DNA/RNA) and (I believe) runs more sophisticated calculations to determine what sort of intermediates are energetically plausible, but it's also much more challenging to set up (you must have the exact same number of atoms in both structures, numbered the exact same way). Here's a link to their website in case you ever need to use it:
Yale Morph Server

Surface Representations, Solvent Exposed Surface Area and Attributes
In order to calculate solvent exposed surface area, we first need to calculate a hydrophobicity surface. This is easily done using the presets (Presets->Hydrophobicity Surface). Note how much more compact the structure appears in this representation than it did in the ribbons representation.
Now the solvent exposed surface area has been calculated for each residue, and we can render the structure by that attribute. Change back to the ribbons preset, and open Render by Attribute (Tools->Structure Analysis->Render by Attribute). Select residues/areaSAS, set one color cutoff to 40 and another to 50, and press accept. Do the results make sense?
We can also output these values to a file - go ahead and do this now for areaSAS, as we will use this file later in the exercises (File->Save Attributes).
We can also render the structure by an arbitrary set of values. I did a large alignment to calculate the percent conservation at each position in the E. coli Rho structure, and then put that data in a format that could be interpreted by Chimera, linked to here: Percent Conserved Attribute File
Go ahead and load this attribute file (Tools->Structure Analysis->Define Attribute) and render by this attribute. Do any particular regions appear to be very highly conserved?
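For reference, an attribute file is plain text: a few header lines followed by tab-indented lines that pair a residue specification with a value. The sketch below shows one way to produce text in that layout from your own numbers. The function name format_attribute_file, the attribute name percentConserved, and the example values are all made up for illustration, and the header shown is my understanding of the format rather than a copy of the linked file.

```python
def format_attribute_file(attribute_name, values):
    """Return attribute-file text for per-residue values.

    values maps (chain, residue_number) -> number.
    """
    lines = [
        f"attribute: {attribute_name}",
        "match mode: 1-to-1",
        "recipient: residues",
    ]
    for (chain, resnum), value in sorted(values.items()):
        # Each data line starts with a tab, then the residue spec, a tab, and the value.
        lines.append(f"\t:{resnum}.{chain}\t{value}")
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only
example = {("A", 1): 12.5, ("A", 2): 87.0, ("B", 1): 12.5}
text = format_attribute_file("percentConserved", example)
print(text)
```

Writing text to a file and loading it through Define Attribute should then make the attribute available in Render by Attribute.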

Note: Generating the attribute file that I've linked to above can, in theory, be done without any coding knowledge because interpretation of sequence alignments is a built-in functionality in Chimera. Here are some brief steps that you can follow to do this on your own that, admittedly, are MUCH easier than writing code for this purpose:
1) Go to NCBI, go to the page for the protein of interest, and then click BLink to cull through pre-computed BLAST hits.
2) Restrict the BLink results as appropriate (for Rho, I restrict to Bacteria only and do not display redundant sequences).
3) Set the display results to 500/page, and then copy/paste the accession numbers on each page to your favorite text document without formatting.
4) Upload the file from step 3 to Batch Entrez (select the Protein database) and save the results.
5) Get the sequence for the structure you are interested in from the PDB and add it to the top of your file from step 4. Save this new file with the .fasta extension (example file here).
6) Run MAFFT (download link) using the file from above, give the output file a .fa extension, and put your results in fasta format in order (option 4). I run the alignment with default parameters.
7) Download the sequence alignment file (example: Rho Alignment) and open it with Chimera. You can either let Chimera search for the relevant structures (Structure->Load Structures in the alignment pop-up window) or open it yourself.
8) Select the sequence conservation attribute you'd like to display with (Tools->Depiction->Render by Attribute, select residues) and press "Accept."
9) In the case of homomeric complexes, you may need to render each subunit individually by re-setting the chain that is matched up with the alignment (Structure->Associations in alignment pop-up window) and repeat step 8.
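
If you would rather script the conservation calculation than use the steps above, the core of it is short. This is a rough sketch under stated assumptions: it takes an aligned FASTA file (such as MAFFT output) whose first record is the structure's sequence, and it defines "percent conserved" as the percentage of the other sequences that match the reference residue at each position. That definition is my assumption and may not be the one used for the linked attribute file. The function names and the tiny alignment are invented for illustration.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def percent_conserved(records):
    """Percent of non-reference sequences matching the reference at each position.

    The first record is the reference. Alignment columns where the
    reference has a gap are skipped, so the result has one value per
    reference residue.
    """
    if len(records) < 2:
        raise ValueError("need a reference plus at least one other sequence")
    reference = records[0][1]
    others = [seq for _, seq in records[1:]]
    if any(len(seq) != len(reference) for seq in others):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for col, ref_res in enumerate(reference):
        if ref_res == "-":
            continue
        matches = sum(1 for seq in others if seq[col] == ref_res)
        result.append(100.0 * matches / len(others))
    return result


# Tiny made-up alignment, for illustration only
alignment = """>reference
MK-LV
>seq1
MKALV
>seq2
MR-LI
>seq3
MK-LV
"""
records = read_fasta(alignment)
conservation = percent_conserved(records)
for position, value in enumerate(conservation, start=1):
    print(position, round(value, 1))
```

Note that the positions printed here count along the reference sequence starting from 1; they would still need to be mapped onto the residue numbering used in the PDB file.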

Exercises


Exercise 1: Load up the Bicyclomycin-bound Rho structure (1XPO) - is the Rho ring open or closed? What is your best guess for the structural basis of the several known bicyclomycin resistance mutations shown below? (Hint: the *swapaa* command may be helpful for visualizing these mutations.)

S266A,S266C

G337S

Exercise 2: Although this is a slight over-simplification, PDB files are mostly just a list of atom identities with x,y,z coordinates and a connectivity map. Write a Python script that is capable of parsing any PDB file and will generate an output file in the following format:

ResidueID# ChainLetter xCoordinate yCoordinate zCoordinate \n
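
If you are unsure where to start, here is one possible skeleton. It assumes the standard fixed-column layout of ATOM and HETATM records (chain ID in column 22, residue number in columns 23-26, and x, y, z in columns 31-38, 39-46, and 47-54). The function names and the three example lines are made up for illustration; reading from and writing to real files is left to you.

```python
def parse_pdb_atoms(lines):
    """Yield (residue_number, chain, x, y, z) for each ATOM/HETATM record."""
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            # PDB columns are 1-based; Python slices are 0-based.
            chain = line[21]
            resnum = int(line[22:26])
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            yield resnum, chain, x, y, z


def format_row(row):
    """Format one parsed atom as 'ResidueID Chain x y z' plus a newline."""
    resnum, chain, x, y, z = row
    return f"{resnum} {chain} {x:.3f} {y:.3f} {z:.3f}\n"


# Made-up records in PDB column layout, for illustration only
example_lines = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842",
    "ATOM   1042  N   GLY B 129      -3.250  10.007 -12.500",
    "TER",
]
rows = list(parse_pdb_atoms(example_lines))
for row in rows:
    print(format_row(row), end="")
```

Splitting on whitespace instead of slicing by column looks simpler, but it breaks when adjacent fields run together (for example, large negative coordinates), which is why the sketch slices fixed columns.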


Exercise 3: Write a function that, given two positions in the format (xcoord,ycoord,zcoord), will output the distance between these two positions. Test your function using an arbitrary pair of atoms from your output file from Exercise 2 and validate your results in Chimera.
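
As a hint, the distance between two points is the square root of the sum of the squared coordinate differences. A minimal sketch (the function name distance is just a suggestion):

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y, z) positions."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Simple check with a known answer: 3^2 + 4^2 + 12^2 = 13^2
print(distance((0.0, 0.0, 0.0), (3.0, 4.0, 12.0)))
```

Coordinates in PDB files are in angstroms, so the result can be compared directly with the value reported by the Distances tool in Chimera.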


Exercise 4: Write a function that, given an attribute file name, returns a list of dictionaries that corresponds to chains A-F (with one dictionary per chain) keyed by residue number and with the attribute as the value. Test your function on both the SAS attribute file and on the percent conserved attribute file.
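
One way to approach this is sketched below. It assumes that data lines in the attribute file begin with a tab and contain a residue specification such as :12.A (possibly with a model prefix, as in #0:12.A), then another tab, then the value. Check this against your own files before relying on it. The function names and the example text are made up for illustration.

```python
import re

# Optional model prefix (e.g. "#0"), then ":RESNUM.CHAIN"
SPEC = re.compile(r"^(?:#\d+(?:\.\d+)?)?:(-?\d+)\.(\w)$")


def parse_attribute_text(text, chains="ABCDEF"):
    """Return a list of dicts (one per chain) mapping residue number -> value."""
    per_chain = {chain: {} for chain in chains}
    for line in text.splitlines():
        # Header and comment lines do not start with a tab; skip them.
        if not line.startswith("\t"):
            continue
        fields = line.strip().split("\t")
        if len(fields) != 2:
            continue
        match = SPEC.match(fields[0])
        if match is None:
            continue
        resnum, chain = int(match.group(1)), match.group(2)
        if chain not in per_chain:
            continue
        try:
            per_chain[chain][resnum] = float(fields[1])
        except ValueError:
            # Skip entries with non-numeric values.
            continue
    return [per_chain[chain] for chain in chains]


def read_attribute_file(filename, chains="ABCDEF"):
    """Read an attribute file from disk and parse it."""
    with open(filename) as handle:
        return parse_attribute_text(handle.read(), chains)


# Made-up file contents, for illustration only
example = (
    "attribute: areaSAS\n"
    "recipient: residues\n"
    "\t#0:1.A\t153.2\n"
    "\t#0:2.A\t48.0\n"
    "\t:1.B\t150.9\n"
)
by_chain = parse_attribute_text(example)
print(by_chain[0])
print(by_chain[1])
```

Keeping the parsing separate from the file reading makes the function easy to test on a short string before pointing it at the real SAS and percent conserved files.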