- Immune system dynamics viewed through the lens of population genetics — how does the genetic diversity of the adaptive immune system change over the course of the life of an individual? What does this tell us about pathogen ↔ immune system dynamics?
- Population genetics of microbial communities — to what extent is genetic variation created and maintained? How much genetic material is exchanged between individuals in a community and what effect does this have on their evolution?
- Sequencing error — how can we properly account for sequencing error (or ancient DNA degradation, etc.) when making inferences based on population genetic theory?
- Mathematical disease modelling in structured populations — how does population structure affect the stochastic initial stages of disease invasion to influence the overall outcome (i.e. pandemic or false alarm)?
Some of these interests may merge on occasion, but the more years I spend in research, the more this list seems to grow.
- Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, Sawyer S, Heinze
A, Renaud G, Sudmant PH, de Filippo C, Li H, Mallick S, Dannemann M, Fu Q,
Kircher M, Kuhlwilm M, Lachmann M, Meyer M, Ongyerth M, Siebauer M, Theunert
C, Tandon A, Moorjani P, Pickrell J, Mullikin JC, Vohr SH, Green RE, Hellmann
I, Johnson PLF, Blanche H, Cann H, Kitzman JO, Shendure J, Eichler EE,
Lein ES, Bakken TE, Golovanova LV, Doronichev VB, Shunkov MV, Derevianko AP,
Viola B, Slatkin M, Reich D, Kelso J, and Pääbo S. (2014). The complete
genome sequence of a Neanderthal from the Altai Mountains. Nature,
- Jónsson H, Ginolhac A, Schubert M, Johnson PLF, and Orlando L.
(2013). mapDamage2.0: fast approximate Bayesian estimates of ancient DNA
damage parameters. Bioinformatics, 29:1682-4. [doi]
- Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M,
Schubert M, Cappellini E, Petersen B, Moltke I, Johnson PLF, Fumagalli
M, Vilstrup JT, Raghavan M, Korneliussen T, Malaspinas AS, Vogt J, Szklarczyk
D, Kelstrup CD, Vinther J, Dolocan A, Stenderup J, Velazquez AMV, Cahill J,
Rasmussen M, Wang X, Min J, Zazula GD, Seguin-Orlando A, Mortensen C,
Magnussen K, Thompson JF, Weinstock J, Gregersen K, Røed KH, Eisenmann V,
Rubin CJ, Miller DC, Antczak DF, Bertelsen MF, Brunak S, Al-Rasheid KAS,
Ryder O, Andersson L, Mundy J, Krogh A, Gilbert MTP, Kjær K,
Sicheritz-Ponten T, Jensen LJ, Olsen JV, Hofreiter M, Nielsen R, Shapiro B,
Wang J, and Willerslev E. (2013). Recalibrating Equus evolution using the
genome sequence of an early Middle Pleistocene horse. Nature,
- Fu Q, Mittnik A, Johnson PLF, Bos K, Lari M, Bollongino R, Sun C,
Giemsch L, Schmitz R, Burger J, Ronchitelli AM, Martini F, Cremonesi RG,
Svoboda J, Bauer P, Caramelli D, Castellano S, Reich D, Pääbo S, and Krause
J. (2013). A revised timescale for human evolution based on ancient
mitochondrial genomes. Curr Biol, 23:553-9. [doi]
- Johnson PLF, Yates AJ, Goronzy JJ, and Antia R. (2012). Peripheral
selection rather than thymic involution explains sudden contraction in naive
CD4 T-cell diversity with age. Proc Natl Acad Sci U S A, 109:21432-7.
- Johnson PLF, Kochin BF, Ahmed R, and Antia R. (2012). How do
antigenically varying pathogens avoid cross-reactive responses to invariant
antigens? Proc Biol Sci, 279:2777-85. [doi]
- Johnson PLF and Hellmann I. (2011). Mutation rate distribution
inferred from coincident SNPs and coincident substitutions. Genome Biol
Evol, 3:842-50. [doi]
- Johnson PLF, Kochin BF, McAfee MS, Stromnes IM, Regoes RR, Ahmed R,
Blattman JN, and Antia R. (2011). Vaccination alters the balance between
protective immunity, exhaustion, escape, and death in chronic infections.
J Virol, 85:5565-70. [doi]
- Burbano HA, Hodges E, Green RE, Briggs AW, Krause J, Meyer M, Good JM,
Maricic T, Johnson PLF, Xuan Z, Rooks M, Bhattacharjee A, Brizuela L,
Albert FW, de la Rasilla M, Fortea J, Rosas A, Lachmann M, Hannon GJ, and
Pääbo S. (2010). Targeted investigation of the Neandertal genome by
array-based sequence capture. Science, 328:723-5. [doi]
- Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson
N, Li H, Zhai W, Fritz MH, Hansen NF, Durand EY, Malaspinas AS, Jensen JD,
Marques-Bonet T, Alkan C, Prüfer K, Meyer M, Burbano HA, Good JM, Schultz R,
Aximu-Petri A, Butthof A, Höber B, Höffner B, Siegemund M, Weihmann A,
Nusbaum C, Lander ES, Russ C, Novod N, Affourtit J, Egholm M, Verna C, Rudan
P, Brajkovic D, Kućan Ž, Gušić I, Doronichev VB, Golovanova LV,
Lalueza-Fox C, de la Rasilla M, Fortea J, Rosas A, Schmitz RW, Johnson
PLF, Eichler EE, Falush D, Birney E, Mullikin JC, Slatkin M, Nielsen R,
Kelso J, Lachmann M, Reich D, and Pääbo S. (2010). A draft sequence of the
Neandertal genome. Science, 328:710-22. [doi]
- Reich D, Green RE, Kircher M, Krause J, Patterson N, Durand EY, Viola B,
Briggs AW, Stenzel U, Johnson PLF, Maricic T, Good JM, Marques-Bonet
T, Alkan C, Fu Q, Mallick S, Li H, Meyer M, Eichler EE, Stoneking M, Richards
M, Talamo S, Shunkov MV, Derevianko AP, Hublin JJ, Kelso J, Slatkin M, and
Pääbo S. (2010). Genetic history of an archaic hominin group from Denisova
Cave in Siberia. Nature, 468:1053-60. [doi]
- Johnson PLF and Slatkin M. (2009). Inference of microbial
recombination rates from metagenomic data. PLoS Genet, 5:e1000674. [doi]
- Green RE, Malaspinas AS, Krause J, Briggs AW, Johnson PLF, Uhler C,
Meyer M, Good JM, Maricic T, Stenzel U, Prüfer K, Siebauer M, Burbano HA,
Ronan M, Rothberg JM, Egholm M, Rudan P, Brajković D, Kućan Ž, Gušić I,
Wikström M, Laakkonen L, Kelso J, Slatkin M, and Pääbo S. (2008). A
complete Neandertal mitochondrial genome sequence determined by
high-throughput sequencing. Cell, 134:416-26. [doi]
- Johnson PLF and Slatkin M. (2008). Accounting for bias from
sequencing error in population genetic estimates. Mol Biol Evol,
- Chang JT, Palanivel VR, Kinjyo I, Schambach F, Intlekofer AM, Banerjee A,
Longworth SA, Vinup KE, Mrass P, Oliaro J, Killeen N, Orange JS, Russell SM,
Weninger W, and Reiner SL. (2007). Asymmetric T lymphocyte division in the
initiation of adaptive immune responses. Science, 315:1687-91. [doi]
- Briggs AW, Stenzel U, Johnson PLF, Green RE, Kelso J, Prüfer K,
Meyer M, Krause J, Ronan MT, Lachmann M, and Pääbo S. (2007). Patterns of
damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci U S
A, 104:14616-21. [doi]
- Cross PC, Johnson PLF, Lloyd-Smith JO, and Getz WM. (2007). Utility
of R0 as a predictor of disease invasion in structured populations. J R
Soc Interface, 4:315-24. [doi]
- Getz WM, Lloyd-Smith JO, Cross PC, Bar-David S, Johnson PLF, Porco
TC, and Sánchez MS. (2006). Modeling the invasion and spread of contagious
disease in heterogeneous populations. In Z Feng, U Dieckmann, and SA Levin,
editors, Disease Evolution: Models, Concepts and Data Analyses,
AMS-DIMACS Series, pages 113-44. American Mathematical Society, Providence,
- Johnson PLF and Slatkin M. (2006). Inference of population genetic
parameters in metagenomics: a clean look at messy data. Genome Res,
- Cross PC, Lloyd-Smith JO, Johnson PLF, and Getz WM. (2005). Duelling
timescales of host movement and disease recovery determine invasion of
disease in structured populations. Ecol Lett, 8:587-95. [doi]
- International Human Genome Sequencing Consortium. (2004). Finishing the
euchromatic sequence of the human genome. Nature, 431:931-45. [doi]
- Bulyk ML, Johnson PLF, and Church GM. (2002). Nucleotides of
transcription factor binding sites exert interdependent effects on the
binding affinities of transcription factors. Nucleic Acids Res,
All software is distributed under the GNU General Public License.
- PIIM: Population Inference In Metagenomics, with recombination (version 2)
This program calculates maximum likelihood estimates
of θ=2Nu (where u is the per-site mutation rate)
and ρ=2Nc (where c is the per-site rate of
initiation of recombination). It also reproduces the
frequency-spectrum functionality from the previous version to
estimate R=Nr (where r is the exponential growth rate).
Input data is genome-level population data of variable sample depth and quality (e.g. metagenomic data).
For details on the method, see:
Johnson PLF and Slatkin M. (2009). Inference of microbial recombination rates from metagenomic data. PLoS Genet, 5:e1000674.
Previous version can be found here.
R package for approximating stochastic simulations (continuous-time Markovian processes). It implements the adaptive tau leaping algorithm of Cao et al. (2007), published in The Journal of Chemical Physics.
Think of differential equations forced to take integer values and allowing for stochastic effects at low numbers. Similar in spirit to GillespieSSA but a bazillion times faster (± a zillion) thanks to implementing in C instead of pure R.
Download page from CRAN.
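For a rough feel of what tau leaping does, here is a minimal Python sketch. It is an illustration only and not the package's code or API. It uses a fixed step `tau` rather than the adaptive step selection of Cao et al., and the SIR model, the function names, and the parameter values are all assumptions chosen for the example. In each step, the number of times each transition fires is drawn from a Poisson distribution, so the state stays integer-valued and stochastic.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate by Knuth's method (only sensible for small lam)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def tau_leap_sir(s, i, r, beta, gamma, tau, t_end, seed=None):
    """Fixed-step tau leaping for a stochastic SIR model (hypothetical example).

    Transitions: infection at rate beta*S*I/N, recovery at rate gamma*I.
    Returns a list of (time, S, I, R) tuples.
    """
    rng = random.Random(seed)
    n = s + i + r
    t = 0.0
    path = [(t, s, i, r)]
    while t < t_end and i > 0:
        # Number of firings of each transition during this leap,
        # capped so that no compartment can go negative.
        infections = min(s, poisson(beta * s * i / n * tau, rng))
        recoveries = min(i, poisson(gamma * i * tau, rng))
        s -= infections
        i += infections - recoveries
        r += recoveries
        t += tau
        path.append((t, s, i, r))
    return path


if __name__ == "__main__":
    final = tau_leap_sir(990, 10, 0, beta=0.5, gamma=0.2, tau=0.05, t_end=50.0, seed=1)[-1]
    print("final (t, S, I, R):", final)
```

The fixed step keeps the sketch short. An adaptive scheme instead chooses each step so that the transition rates change little within it, which is where most of the speed and accuracy come from.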
Useful tools, BibTeX styles, etc.
Sometimes I feel like I spend most of my time shuffling data about and fighting with computers, so I've written many a tool to make my life easier. Perhaps these will be useful to someone else. I use Linux, so most tools will run on Mac without trouble, but Windows could be a headache.
All tools are distributed under the GNU General Public License. Give me a shout if you find a bug or if you find a tool particularly useful. The extent of documentation varies, but everything displays at least a brief usage statement if you run it without parameters.
- FASTA manipulation
- Scripts for manipulating fasta files in descending order of bugfreeness / awesomeness:
- FaIndex.pm -- Perl module that creates an index of sequences in fasta file(s) and uses it to extract subregions. Disk access is via memory mapping, which has advantages (fast) and disadvantages (no single fasta file can be larger than 2 GB). Requires the File::Map package from CPAN.
- fa_extract_many -- very quickly extracts regions from fasta files using the above FaIndex.pm module (will look for module in standard directories and in ~/bin).
- fa_wrap -- wrap fasta sequence to specified width
- fa_length -- list sequence ids and lengths
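The index-then-extract idea behind these scripts can be sketched in Python as follows. This is not FaIndex.pm itself: the function names and the index layout (length, byte offset, bases per line, bytes per line) are assumptions made for illustration, and the sketch expects every line of a record except the last to have the same width.

```python
import mmap


def build_index(path):
    """Scan a FASTA file once.

    Returns {id: (length, seq_offset, bases_per_line, bytes_per_line)}.
    """
    index = {}
    name = None
    offset = 0
    with open(path, "rb") as fh:
        for line in fh:
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                # The sequence starts right after the header line.
                index[name] = [0, offset + len(line), 0, 0]
            elif name is not None and line.strip():
                entry = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if entry[2] == 0:  # first sequence line fixes the line width
                    entry[2], entry[3] = bases, len(line)
                entry[0] += bases
            offset += len(line)
    return {key: tuple(value) for key, value in index.items()}


def extract(mm, index, name, start, end):
    """Return bases [start, end) (0-based, half-open) from a memory-mapped file."""
    length, seq_offset, bases_per_line, bytes_per_line = index[name]
    if not 0 <= start <= end <= length:
        raise ValueError("region outside sequence")
    if start == end:
        return ""

    def byte_pos(i):
        return seq_offset + (i // bases_per_line) * bytes_per_line + i % bases_per_line

    raw = mm[byte_pos(start):byte_pos(end - 1) + 1]
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()


def extract_region(path, index, name, start, end):
    """Memory-map the file read-only and pull out one region."""
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return extract(mm, index, name, start, end)
```

Because the index stores byte offsets, extracting a region touches only the bytes it needs. The sequence lengths that a tool like fa_length reports fall out of the same index as `index[name][0]`.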
- Improvements to standard bioinformatic tools
- Flat file manipulation
- FF_Index.pm -- A clever (if I do say so myself) Perl module for indexing flat files for quick data retrieval. Crucially, this is easy to use and creates a separate index file instead of mucking about with the original file.
- groupby -- approximates "group by" functionality of SQL, but takes tab-delimited flat files with one line per record (must already be sorted according to grouping keys).
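As an illustration of why the input must already be sorted, here is a small Python sketch of the same idea. It is not the groupby script itself; the function name, its arguments, and the choice of a numeric value column are assumptions. Like the tool described above, it only merges records whose keys are adjacent, so it can stream through a file without holding it in memory.

```python
import itertools


def group_by(lines, key_cols, value_col, agg=sum):
    """Yield (key_tuple, aggregate) for runs of tab-delimited records sharing a key.

    key_cols  -- column indices that form the grouping key
    value_col -- column index of the numeric value to aggregate
    agg       -- any function of an iterable of floats (sum, max, min, ...)
    """
    rows = (line.rstrip("\n").split("\t") for line in lines if line.strip())

    def key_of(row):
        return tuple(row[c] for c in key_cols)

    # itertools.groupby only merges consecutive rows with equal keys,
    # which is why the input has to be sorted by the grouping keys.
    for key, group in itertools.groupby(rows, key=key_of):
        yield key, agg(float(row[value_col]) for row in group)


if __name__ == "__main__":
    records = ["chr1\tgeneA\t3\n", "chr1\tgeneB\t4\n", "chr2\tgeneC\t5\n"]
    for key, total in group_by(records, key_cols=[0], value_col=2):
        print("\t".join(key), total, sep="\t")
```

On unsorted input the same key shows up as several separate groups rather than raising an error, so sorting first (for example with the standard `sort` utility) is the caller's responsibility.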
- Queueing scripts
- Condor provides an elegant queueing system for running programs on a cluster of machines (either dedicated compute nodes or temporarily unused desktops). However, the supplied interface makes submitting jobs a pain. Submitting should be as easy as supplying the exact same command line that you would use if executing locally, i.e. the first line below (local execution) should become the second (queued):
./my_program -f some_options > my_output
qsub './my_program -f some_options > my_output'
I have a suite of scripts that does exactly this for Condor.
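One way such a wrapper could work is sketched below in Python. This is a guess at the general approach, not the scripts mentioned above. It puts the command line in a small shell script, so that redirections like `> my_output` are handled by the shell, and generates a Condor submit description that runs that script. The function name, the file naming scheme, and the choice of submit commands are all assumptions.

```python
import os
import shlex


def make_condor_job(command, job_name="job", workdir="."):
    """Return (wrapper_script_text, submit_description_text) for one command line.

    Hypothetical helper: the wrapper is expected to be saved as
    <workdir>/<job_name>.sh and made executable before submission.
    """
    workdir = os.path.abspath(workdir)
    wrapper_path = os.path.join(workdir, job_name + ".sh")

    # The shell script runs the command exactly as typed, from the same directory.
    script = (
        "#!/bin/sh\n"
        "cd " + shlex.quote(workdir) + " || exit 1\n"
        + command + "\n"
    )

    # Minimal submit description; a real cluster may need more settings.
    submit = "\n".join([
        "universe   = vanilla",
        "executable = " + wrapper_path,
        "initialdir = " + workdir,
        "output     = " + job_name + ".out",
        "error      = " + job_name + ".err",
        "log        = " + job_name + ".log",
        "queue",
    ]) + "\n"
    return script, submit


if __name__ == "__main__":
    script, submit = make_condor_job("./my_program -f some_options > my_output")
    print(script)
    print(submit)
```

To use it, save the two strings to files, mark the wrapper script executable, and pass the submit description to `condor_submit`. Whether extra settings such as file transfer are needed depends on how the pool is configured.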
- devEMF is an R package that provides an EMF (enhanced metafile) graphics driver to make producing EMF graphics as easy as EPS/PDF/PNG/etc. EMF is a vector-based format, so it will always look good no matter how much you enlarge it. I wrote this driver out of frustration with the lousy importation of EPS graphics in both OpenOffice and Microsoft Office (both import EMF files seamlessly).
- BibTeX style files (bst) for biology journals
Why, oh why, do so few journals supply bibliography style files?
I finished my PhD in 2009 at UC Berkeley in Biophysics with a designated emphasis in computational biology while studying theoretical population genetics with Montgomery Slatkin. Now I have a postdoc with Rustom Antia at Emory University.
If you're really curious, you can check out my cv.
Why the L F? Turns out there are a few other "Johnson"s out there in the world and even other "Philip Johnson"s. But I'm not an architect and certainly not a Berkeley law professor. Particularly with the latter, hilarity may ensue when mixing us up.
Lab: +1 404 727 1765
Fax: +1 404 727 2880
Department of Biology
1510 Clifton Rd NE, Rm 2006
Atlanta, GA 30322
Updated Feb 2014.