jamimmunology: HLA

Sunday, 11 February 2018

High-throughput immunopeptidomics

In my PhD I focused on studying the complexity of the immune system at the level of the T cell receptor. Recently I’ve been getting into what happens on the other side of the conversation as well; in addition to looking at TCR repertoires I’m increasingly playing with MHC-bound peptide repertoires too.

Immunopeptidomics is a super interesting field, with a great deal of promise, but it’s got a much higher barrier to entry for research groups relative to something like AIRR-seq. Nearly every lab can do PCR, and access to deep-sequencing machines or cores becomes ever cheaper and more commonplace. However, not every lab has expertise with fiddly pull downs, while only a tiny fraction can do highly sensitive mass spec. This is why efforts to make immunopeptide data generation and sharing easier should be suitably welcomed.

One of the groups whose work commendably contributes to both of these efforts is that of Michal Bassani-Sternberg. For sharing, she consistently makes all of her data available (and is seemingly a senior founder and major contributor to the recent SysteMHC Atlas Project), while for generation her papers give clear and thorough technical notes, which aid in reproducibility.

However, from the generation perspective, this paper (which came out at the end of last year in Mol. Cell Proteomics) describes a protocol which – through application of sensible experimental design – should result in the easier production of immunopeptidomic data, even from more limited samples.

The idea is to basically increase the throughput of the methods by hugely reducing the number of handling steps and time required to do the protocol. Samples are mushed up, lysed, spun, and then run through a variety of stacked plates. The first (if required) catches irrelevant, endogenous antibodies in the lysates; the next catches MHC class I (MHC-I) peptide complexes via bead-cross-linked antibodies; the next similarly catches pMHC-II, while the final well catches everything else (giving you lovely sample-matched gDNA and proteomes to play with, should you choose). Each plate of pMHC can then be taken and treated with acid to elute the peptides from their grooves, before purification and mass spec. It’s a nice neat solution, which supposedly can all be done with readily commercially available goodies (although how much all these bits and bobs cost I have no idea).

Crucially it means that you get everything you might want (peptides from MHC-I/-II, plus the rest of the lysates) in separate fractions, from a single input sample, in a protocol that spans hours rather than days. Having it all done in one pass helps boost recovery from limited samples, which is always nice for, say, clinical material. Although I should say, ‘limited’ is a relative term. For people used to dealing with nice, conveniently amplifiable nucleic acids, tens to thousands of cells may be limiting. Here, they managed to go down as low as 10 million. (Which is not to knock it, as this is still much much better than the hundreds of millions to billions of cells which these experiments can sometimes require. I don’t want everyone to go away thinking about repurposing their collection of banked Super Rare But Sadly Impractically Tiny tissue samples here.)

So on technical merit alone, it’s already a pretty interesting paper. However, there’s also a nice angle where they test out their new protocol on an ovarian carcinoma cell line with or without IFNg treatment, which tacks on a nice bit of biology to the paper too.

You see the things you might expect – like a shift in peptides seemingly produced by degradation from the standard proteasome to more of those produced by the immunoproteasome – and some you might not. Another nice little observation which follows on perfectly from this is that you also see an alteration in the abundance of peptides presented by different HLA alleles: for instance the increased chymotryptic-like degradation of the immunoproteasome favours the loading of HLA-B*07:02 molecules, due to making more peptides with the appropriate motif.

My favourite observation however relates to the fact that there’s a consistent quantitative and qualitative shift in peptidomes between IFNg treated cells and mock. This raises an interesting possibility to me, about what should be possible in the near future, as we iron out the remaining wrinkles in the methodologies. Not only should we learn about what proteins are being expressed, based on which proteins those peptides are derived from, but we should be able to infer something about what cytokines those cells have been exposed to, based on how those peptides have been processed and presented.

Thursday, 15 August 2013

5 minute PyMOL tutorial

I'm very lucky to have a pretty great PhD project, deep-sequencing T cell receptor repertoires. It's an interesting, exciting field, that I have only one major complaint about - I don't get to make any pretty pictures.

It sounds like a minor complaint, but it makes a difference when it comes to presentation time. My main output is sequence data; at best I produce some nice graphs, which rather lack the impact of, say, some beautiful confocal microscopy.

In order to get nice pictures to open a presentation on, I like to just fire up some molecular visualisation software and get some nice screenshots of whatever molecules are relevant to the talk. In case others wanted to do similarly - make nice pictures, without needing to become an expert in molecular visualisation or modelling - but didn't know how, here's a mini-tutorial as to how I do it.

First off, you need some visualisation software. If you're a student or teacher, I think you can get the excellent PyMOL for free, which I'll use in this blog, although there are others. I personally cut my visualisation teeth on a combination of Jmol, and (the now decidedly retro) RasMol - what you learn in one is broadly applicable to the others.

Let's get a structure to work on - head on over to the Protein Data Bank and look up something you're interested in. You can search by molecule name, ID (if you have it, you can find these in structural papers) or author name.

We need to download the PDB file for a molecule of interest, which is basically just a text file containing a set of x, y, z coordinates for all the atoms in that molecule (as described by solving the crystal structure, for example).
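
To make the "just a text file" point concrete, here's a minimal Python sketch that pulls chain letters and x, y, z coordinates out of ATOM records. It assumes the standard fixed-column PDB layout; the helper names are my own, and the sample atoms and their coordinates are invented for illustration rather than taken from a real entry.

```python
# Minimal sketch: read chain IDs and x, y, z coordinates from PDB ATOM records.
# Assumes the standard fixed-column PDB layout; the sample atoms are invented.

def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Build an ATOM line in the fixed PDB columns (illustrative helper)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain, res_seq, x, y, z)

def parse_atom(line):
    """Slice one ATOM/HETATM line into fields (0-based slices of 1-based columns)."""
    return {
        "atom": line[12:16].strip(),      # columns 13-16: atom name
        "residue": line[17:20].strip(),   # columns 18-20: residue name
        "chain": line[21],                # column 22: chain identifier
        "res_seq": int(line[22:26]),      # columns 23-26: residue number
        "xyz": (float(line[30:38]),       # columns 31-38: x
                float(line[38:46]),       # columns 39-46: y
                float(line[46:54])),      # columns 47-54: z
    }

def read_atoms(lines):
    """Keep only coordinate records, skipping headers, remarks and so on."""
    return [parse_atom(l) for l in lines if l.startswith(("ATOM", "HETATM"))]

sample = [
    "REMARK invented example records, not taken from a real entry",
    format_atom(1, "N", "GLY", "A", 1, 11.104, 6.134, -6.504),
    format_atom(2, "CA", "GLY", "A", 1, 11.639, 6.071, -5.147),
    format_atom(3, "CA", "PHE", "C", 1, -1.5, 20.25, 3.0),
]

atoms = read_atoms(sample)
for a in atoms:
    print(a["chain"], a["residue"], a["atom"], a["xyz"])
```

To try it on a real file, pass `open("your_file.pdb")` to `read_atoms` in place of `sample`.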

I've downloaded the PDB for 1MI5, a crystal structure for the LC13 TCR bound to HLA-B8 complexed with a peptide from the EBNA 3A molecule of Epstein-Barr virus.
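
If you'd rather grab the file from a script than click through the website, something like the following should do it. This is only a sketch: it assumes the RCSB file server's `https://files.rcsb.org/download/<ID>.pdb` URL pattern, and the function names are my own.

```python
# Minimal sketch: download a PDB file by ID using only the standard library.
# Assumes the RCSB file server pattern https://files.rcsb.org/download/<ID>.pdb
from urllib.request import urlretrieve

def pdb_url(pdb_id):
    """Build the download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("expected a four-character PDB ID, got %r" % pdb_id)
    return "https://files.rcsb.org/download/%s.pdb" % pdb_id

def download_pdb(pdb_id, dest=None):
    """Save the PDB file locally and return its path (needs network access)."""
    dest = dest or "%s.pdb" % pdb_id.strip().lower()
    urlretrieve(pdb_url(pdb_id), dest)
    return dest

print(pdb_url("1mi5"))
# To actually fetch the file, call: download_pdb("1mi5")
```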

Fire up PyMOL - there are two windows, a viewer and an interface - and open your PDB (note that if you already know the ID of your molecule, you can use the fetch command to open it without downloading the PDB yourself, e.g. 'fetch 1zgl').

At first it'll probably look like a big mess.

Orient yourself. It's pretty easy to move about with the mouse; it's just hard to remember which button does what. Handily, the designers provided a nice little reminder in the bottom right-hand side.

Basically, the left mouse button rotates, the middle button translates (pans), and the right mouse button zooms. Have a play and get used to it, though it probably doesn't matter much anyway: if you're anything like me you'll forget it and have to rediscover it every time.

Now let's make it look pretty.

By default, all molecules are selected. We can use the five buttons in the top right to perform various tasks on whatever's currently selected.

The left column represents the different selections. The buttons on the right are the tasks that can be applied to those selections. Left to right: Action; Show; Hide; Label; Color

Unless I'm looking at specific interactions, I like cartoon display, as I think this makes it easiest to see generally what goes where in a structure, particularly if there's more than one polypeptide present. Click 'S' for show, then choose cartoon.

It looks somewhat fuzzy now, as it's showing the cartoon display overlain on everything else that was there. To get rid of all the fluff, go to the 'H' (hide) menu, and hide the lines (default layout) and the waters (sporadic red dots).

Much better.

Now we want to be able to tell our different molecules apart. This is where the PDB web entry for our molecule really comes in handy, as it tells us which letter refers to each separate molecule. This allows us to make different selections, which we can then use to apply actions to particular molecules or subsets of molecules.

So, looking at our entry, I can enter the following code into the command line area of the interface window to generate our selections.

 # chain letters written in upper case, as listed in the PDB entry
 select MHC, (chain A)
 select B2m, (chain B)
 select peptide, (chain C)
 select alpha, (chain D)
 select beta, (chain E)

TCR people, note that the three or four other TCR-pMHC complexes I've tried this on all used the same letters for each chain, so this might be the convention.

You can then use the 'C' color button on each selection to colour them separately. I've arranged the molecule so you see down the groove of the MHC molecule, as this shows the interaction site of the whole complex well.

However, I always think it's a bit hard to see the peptide in amongst all the other molecules, so I've gone back and changed it to show as spheres.
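
If you find yourself doing this for several structures, the typing can be scripted. Below is a small Python sketch that turns a table of chains into PyMOL command lines, which you could paste into the command line or save as a .pml script. The colour choices and the function name are my own assumptions; the chain letters are the ones for 1MI5, written in upper case as they appear in the PDB entry, so check the entry for any other structure.

```python
# Sketch: build PyMOL commands for naming, displaying and colouring chains.
# Colours and helper names are illustrative; chain letters follow 1MI5.

CHAINS = [
    # (chain letter, selection name, colour, representation)
    ("A", "MHC", "green", "cartoon"),
    ("B", "B2m", "cyan", "cartoon"),
    ("C", "peptide", "red", "spheres"),
    ("D", "alpha", "orange", "cartoon"),
    ("E", "beta", "blue", "cartoon"),
]

def styling_commands(chains):
    """Return PyMOL command lines: clear the display, then style each chain."""
    lines = ["hide everything"]
    for letter, name, colour, rep in chains:
        lines.append("select %s, chain %s" % (name, letter))
        lines.append("show %s, %s" % (rep, name))
        lines.append("color %s, %s" % (colour, name))
    lines.append("deselect")  # clear the active selection markers in the viewer
    return lines

commands = styling_commands(CHAINS)
print("\n".join(commands))
```

Keeping the chain table as plain data means swapping in another complex is a matter of editing five lines rather than redoing the clicks.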

After that, we just need to save a high-resolution image, which we can get through ray tracing:

 ray 2400, 2400

Just save the image after ray tracing. I then cropped this down using Gimp.
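
If you need a particular print size rather than a round number of pixels, the arithmetic is just inches multiplied by dots per inch. Here's a quick Python sketch; the 8 inch and 300 dpi figures are my own example (they happen to give 2400 pixels), the output filename is made up, and the second line assumes PyMOL's `png filename, dpi=...` form for saving the ray-traced image.

```python
# Sketch: work out ray-tracing dimensions from a target print size.

def ray_commands(width_in, height_in, dpi, filename):
    """Return PyMOL 'ray' and 'png' command lines for a given print size."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return [
        "ray %d, %d" % (width_px, height_px),
        "png %s, dpi=%d" % (filename, dpi),
    ]

for line in ray_commands(8, 8, 300, "tcr_pmhc.png"):
    print(line)
```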

All done!

When using crystal structures that you didn't produce, remember to give proper attribution; either cite the original paper, or provide the PDB ID.

For a more in-depth understanding of PyMOL, why not read the manual or check out the PyMOL wiki.