I dropped out of a Ph.D. program 4 years ago and joined the start-up tech world, a brash change for a synthesist. But now I’m back, becoming a “financial martyr,” as I’ve been told before. So why am I here?
When I was 20 I wanted to start putting chemistry into the computer. A lot of my friends were programmers and I thought this was the most amazing thing. It started off with trying to represent the periodic table as class objects with labeled attributes. The minute we got to different organic compounds the project crumbled… we were not ready.
I’m sitting here in this PhD program and sometimes I go kind of crazy; there’s a lot of pressure. During these times when I question myself, I often look at my past and where I came from, and ask what I am actually doing. I do this as an affirmation that I can keep going: I have done this before, I can do it again, and the next time will be better.
So I’ll go through it, and here are some honest thoughts to myself.
Principles of Microeconomics…
A PhD is a tough battle; a lot of days I debate whether I am smart enough for this line of work. I had this crazy idea, but it crumbled before me today and it actually made me really sad. I’m going through some crazy emotions right now, and maybe my code is getting weirder or better? You can decide. Something that came out of this was that I needed to model how cheminformatics relates to things.
So I went with a Newick tree, a tree format used to draw phylogenetic trees for sequences that align with each other. It can…
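To make the Newick idea concrete, here is a minimal sketch of turning a nested hierarchy into a Newick string. The `to_newick` helper and the toy hierarchy are my own hypothetical illustration, not the model from the post; the only thing taken from the format itself is that children go in parentheses, separated by commas, with the parent's name after the closing parenthesis and a `;` at the end.

```python
def to_newick(node):
    """Serialize a (name, children) tuple into a Newick subtree string."""
    name, children = node
    if not children:
        # A leaf is written as just its name
        return name
    # Children are comma-separated inside parentheses, parent name follows
    inner = ",".join(to_newick(child) for child in children)
    return f"({inner}){name}"


# Hypothetical hierarchy, purely for illustration
tree = ("cheminformatics", [
    ("chemistry", [("organic", []), ("inorganic", [])]),
    ("computing", []),
])

# A complete Newick tree ends with a semicolon
newick = to_newick(tree) + ";"
print(newick)  # ((organic,inorganic)chemistry,computing)cheminformatics;
```

Names with spaces or punctuation would need quoting, which this sketch does not handle.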
You hire a cheminformatician, and she or he will come with a bag of tricks. We’ve seen things down there in the data. Here’s a 1-minute trick I’ve seen, for my fellow peers, as a fun thing :).
So let’s start in Linux or macOS (idk about Windows, I’m not a Windows user):
$ echo 'where is that electron?'
>>> where is that electron?
Then type in history to see my last line:
$ history
>>> history
>>> 10260 echo 'where is that electron?'
Now let’s put a space before the echo command to treat it…
I’m back, before the next wave of exams hits! I’ve got a lot of chemical compound data to process, and I’m using a pretty complex but simple pipeline (my own). So I need to choose my tools wisely. Well, what monitoring tool can actually give me value rather than just look pretty and… you know me… is it pythonable?
I’m looking for CPU load, how much my pipeline takes up, whether I can obtain the data somehow for benchmark tests, and whether the tool is interoperable.
After looking through Prometheus and others, I stumbled upon Glances for Python. …
The 2nd rotation of a wide-eyed cheminformatician in training begins! (The first one is still kinda not over… I’ll talk about that in another blog.)
I was tasked with screening a chemical database; the end goal I’m still trying to determine. I had the pick of the litter, so which database do I choose? The classics like ZINC, ChEMBL, etc., or one of the harder ones, the Enamine REAL DB? Of course, I went for the latter, since the challenge is more fun.
Okay, we are looking at 1.36 billion compounds split into 20 .smiles files, each containing roughly 68 million compounds.
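At that scale (1.36 billion / 20 = 68 million lines per file) loading a whole file into memory is not something I'd want to try. Here's a minimal sketch of streaming one file in batches instead. The function names and batch size are hypothetical, and it assumes one compound per line with the SMILES string as the first whitespace-separated column and no header line, which may not match the real files.

```python
from itertools import islice


def iter_smiles_batches(path, batch_size=100_000):
    """Yield lists of SMILES strings from a .smiles file, batch_size at a time."""
    with open(path) as handle:
        # Assumption: SMILES is the first column; blank lines are skipped
        smiles = (line.split()[0] for line in handle if line.strip())
        while True:
            batch = list(islice(smiles, batch_size))
            if not batch:
                return
            yield batch


def count_compounds(path, batch_size=100_000):
    """Count compounds in a file without holding more than one batch in memory."""
    return sum(len(batch) for batch in iter_smiles_batches(path, batch_size))
```

Each batch can then be handed to whatever screening step comes next, so memory use stays bounded by `batch_size` rather than by the file size.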
Ah, log files: gotta love them and hate them. During my rotation project for the PhD I often have to check these log files for the progress of molecular dynamics simulations (I’m using GROMACS btw). These simulations can take days to run, which means I have to do a daily check, usually 3 times a day.
Now this task is very cumbersome… because if I have 12 simulations running, that means 12 md_3.log files being produced (the standard log file for GROMACS), and this is just a rotation, so imagine after that. So to do my daily check, here’s the time breakdown:
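One way to cut that daily check down is to script it. Below is a minimal sketch that collects the last few lines of every log under a directory. The helper names and the layout (one `md_3.log` per simulation folder somewhere under a root directory) are my assumptions for illustration, and it does not parse anything GROMACS-specific; it just tails the files.

```python
from collections import deque
from pathlib import Path


def last_lines(log_path, n=5):
    """Return the last n lines of a log file, keeping only n lines in memory."""
    with open(log_path, errors="replace") as handle:
        # deque with maxlen discards older lines as it reads through the file
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


def summarize_logs(root, pattern="md_3.log", n=5):
    """Map every matching log file under root to its last n lines."""
    return {
        str(path): last_lines(path, n)
        for path in sorted(Path(root).rglob(pattern))
    }


if __name__ == "__main__":
    # Print a quick progress report for all simulations under the current folder
    for path, lines in summarize_logs(".").items():
        print(f"== {path}")
        print("\n".join(lines))
```

Run from the folder that holds the simulation directories, this prints the tail of all 12 logs in one go instead of opening each by hand.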
So here I am, a first-year PhD student, and it’s been a daunting process to get into, but if you’re like me and have similar requirements, then this blog is for you!
Note: Utility functions like initiate_universe are included at the bottom for folks, to keep the initial code cleaner :).
If you’ve come across this article, you must be a Python dev (like myself) who has come across MDAnalysis but, reading over the docs, discovered you know nothing about molecular dynamics. So let’s start with the basics: take the distance between two atoms in space over a trajectory.
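Before touching any library, it helps to see what that measurement boils down to. This is a minimal sketch in plain Python, not MDAnalysis code: a trajectory is treated as a list of frames, each frame as a list of (x, y, z) positions, and the coordinates are made up. It also ignores periodic boundary conditions, which a real simulation box would need to account for.

```python
import math


def distance_over_trajectory(frames, i, j):
    """Return the Euclidean distance between atoms i and j in every frame."""
    return [math.dist(frame[i], frame[j]) for frame in frames]


# Toy trajectory (made-up coordinates): 3 frames, 2 atoms, (x, y, z) each
frames = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)],
    [(1.0, 1.0, 1.0), (1.0, 1.0, 1.0)],
]

print(distance_over_trajectory(frames, 0, 1))  # [5.0, 2.0, 0.0]
```

The result is one distance per frame, which is exactly the kind of time series you would plot to watch two atoms move apart or together.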
I’ve got two atoms in space. First I’m going to load up my topology and trajectory using mda and initiate the universe (sounds so cool):
import os
import MDAnalysis as mda

def initiate_universe(topology, trajectory):
    # Build a Universe from a topology file and a trajectory file
    universe = mda.Universe(topology, trajectory)
    return universe

if __name__ == '__main__':
    # File Setup
    current_directory = os.getcwd()…
So I’ve entered the realm of molecular dynamics and I am constantly hearing this term “Force Fields”. The term has felt so loose and not really quantifiable, perhaps bordering on magic… there is something going on under the hood that I think can only really be discovered by looking at the raw data.
Here’s a result of that….
Since I joined the University of Maryland (UM) cohort, I figured I would check out what the folks in this new city have on the block.
CHARMM is one of these force fields that hangs in the molecular dynamics space and stems from my…