A PhD is a tough battle; a lot of days I debate whether I am smart enough for this line of work. I had this crazy idea, but it crumbled before me today and it actually made me really sad. I’m going through some crazy emotions right now, and maybe my code is getting weirder or better? You can decide. One thing that came out of this was that I needed to model how cheminformatics relates to things.
So I went with a Newick tree, a tree format used for drawing phylogenetic trees of sequences aligned with each other. It can…
You hire a cheminformatician — she or he will come with a bag of tricks. We’ve seen things down there in the data. Here’s a 1 minute trick for my fellow peers that I’ve seen — as a fun thing :).
So let’s start in linux or mac os (idk about windows — I’m not a windows user):
$ echo 'where is that electron?'
>>> where is that electron?
Then type in history and my last line:
$ history
>>> history
>>> 10260 echo 'where is that electron?'
Now let’s put a space before the echo command to treat it…
I’m back! Before the next wave of exams hits. I’ve got a lot of chemical compound data to process, and I’m using a pretty complex yet simple pipeline (my own). So I need to choose my tools wisely. Well, what monitoring tool can actually give me value rather than just look pretty and... you know me... is it pythonable?
I’m looking for CPU load, how much of it my pipeline takes up, whether I can obtain the data somehow for benchmark tests, and whether it is interoperable.
After looking through Prometheus and a few others, I stumbled upon glances for Python. …
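To make the "can I obtain the data for benchmark tests" requirement concrete, here is a minimal sketch using only the Python standard library. It is not glances and not its API; the function names sample_load and to_csv are made up for illustration, and os.getloadavg() only works on Linux and macOS.

```python
import csv
import io
import os
import time


def sample_load(samples=3, interval=0.1):
    """Collect (elapsed_seconds, 1-minute load average) pairs."""
    start = time.monotonic()
    rows = []
    for _ in range(samples):
        # os.getloadavg() is Unix-only and returns the 1, 5 and 15 minute averages
        one_min, _five_min, _fifteen_min = os.getloadavg()
        rows.append((round(time.monotonic() - start, 3), one_min))
        time.sleep(interval)
    return rows


def to_csv(rows):
    """Serialize the samples as CSV text so other tools can read them."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["elapsed_s", "load_1min"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling to_csv(sample_load()) while the pipeline runs gives a small table that can be saved and compared between runs.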
The 2nd rotation of a wide-eyed cheminformatician in training begins! (The first one is still kinda not over... I’ll talk about that in another blog.)
I was tasked with screening a chemical database; the end goal I’m still trying to determine. I had the pick of the litter: which database do I choose? The classic ZincDB, ChEMBL, etc., or one of the harder ones, the Enamine REAL DB? Of course, I went for the latter; the challenge is more fun.
Okay, we are looking at 1.36 billion compounds split into 20 .smiles files, each containing roughly 68 million compounds.
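Files of that size don't fit comfortably in memory, so here is a rough sketch of the arithmetic above plus a way to stream one file line by line. It assumes one compound per line with the SMILES string in the first whitespace-separated column and no header row, which may not match the real files; the function names are hypothetical.

```python
def compounds_per_file(total, n_files):
    """Evenly split a compound count across files."""
    return total // n_files


def iter_smiles(path):
    """Lazily yield the first column of each non-empty line."""
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:  # skip blank lines
                yield fields[0]


def count_compounds(path):
    """Count compounds without loading the whole file into memory."""
    return sum(1 for _ in iter_smiles(path))
```

Because iter_smiles is a generator, only one line is held in memory at a time, no matter how many millions of compounds the file contains.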
Ah, log files. Gotta love them and hate them. During my rotation project for the PhD I often have to check these log files for the progress of molecular dynamics simulations (I’m using GROMACS btw). These simulations can take days to run, which means I have to do a daily check, usually 3 times a day.
Now this task is very cumbersome... because if I have 12 simulations running, that means 12 log files being produced,
like md_3.log (standard log file for GROMACS), and this is just a rotation, so imagine after that. So to do my daily check here’s the time breakdown
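One way to cut the manual checking down is to have Python grab the last few lines of every log in one go. This is only a sketch: it assumes the simulations live in sub-folders under one root directory and that their logs end in .log, and it does not parse anything GROMACS-specific. The names tail and check_logs are my own.

```python
from collections import deque
from pathlib import Path


def tail(path, n=5):
    """Return the last n lines of a text file."""
    with open(path, errors="replace") as handle:
        # deque with maxlen keeps only the final n lines while streaming
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


def check_logs(root, pattern="*.log", n=5):
    """Map every matching log file under root to its last n lines."""
    return {str(path): tail(path, n) for path in sorted(Path(root).rglob(pattern))}
```

Looping over the result of check_logs and printing each entry turns 12 separate file checks into one command.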
So here I am, a first-year PhD student, and it’s been a daunting process to get into, but if you’re like me and have similar requirements then this blog is for you!
Note: Utility functions like initiate_universe I’m gonna include at the bottom for folk, to keep the initial code cleaner :).
If you’ve come across this article, you must be a python-dev (like myself) who has come across MDAnalysis, but reading over the docs you’ve discovered you know nothing about molecular dynamics. So let’s start at the basics: take the distance between two atoms in space over a trajectory.
I’ve got two atoms in space. First I’m going to load up my topology and trajectory using mda and initiate the universe (sounds so cool):
import os
import MDAnalysis as mda

def initiate_universe(topology, trajectory):
    # Build the MDAnalysis Universe from a topology and a trajectory file
    universe = mda.Universe(topology, trajectory)
    return universe

if __name__ == '__main__':
    # File Setup
    current_directory = os.getcwd()
    topology = os.path.join(current_directory, 'topology.psf')
    trajectory = os.path.join(current_directory…
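For the "distance between two atoms over a trajectory" idea, the underlying arithmetic is just a Euclidean distance repeated for every frame. Here is a minimal sketch of that step in plain Python, without MDAnalysis; distance_per_frame and its input layout are hypothetical, chosen only to show the calculation.

```python
import math


def distance_per_frame(frames):
    """Compute the distance between two atoms for every frame.

    frames is an iterable of (atom_a_xyz, atom_b_xyz) pairs, one pair per frame.
    """
    return [math.dist(atom_a, atom_b) for atom_a, atom_b in frames]
```

With MDAnalysis, the coordinate pairs would come from looping over universe.trajectory and reading the position of each of the two atoms at every frame.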
So I’ve entered the realm of molecular dynamics and I am constantly hearing this term “Force Fields”. The term has felt so loose and not really quantifiable, perhaps bordering on magic... there is something going on under the hood that I think can only really be discovered by looking at the raw data.
Here’s a result of that….
Since I joined the University of Maryland (UM) cohort, I figured I would check out what the folk in this new city have on the block.
CHARMM is one of these force fields that hangs in the molecular dynamics space and stems from my…
I’ve started my journey into the world of molecular dynamics, and so I’m getting used to the file formats in this field of science.
I needed to convert my .gro files into .pdb, and this seemed tricky to implement until I came across a package called MDAnalysis. It’s a pretty nifty package and is pip installable!
So to start off, import MDAnalysis and initiate a universe with your Gromacs file as the path.
import MDAnalysis as mda
universe = mda.Universe(path_to_gromacs_file)
Next, we will use the Writer object from MDAnalysis with the output file specified with the extension…
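Putting the pieces together, a conversion could look roughly like the sketch below. The helper names pdb_path_for and convert_gro_to_pdb are mine, and writing the .pdb next to the input file is an assumption. MDAnalysis is imported inside the function so the path helper still works on a machine without it installed.

```python
from pathlib import Path


def pdb_path_for(gro_path):
    """Swap the file extension for .pdb, keeping the same folder and name."""
    return str(Path(gro_path).with_suffix(".pdb"))


def convert_gro_to_pdb(gro_path):
    """Load a .gro file and write all of its atoms out as a .pdb file."""
    import MDAnalysis as mda  # imported lazily; needs `pip install MDAnalysis`

    universe = mda.Universe(gro_path)
    out_path = pdb_path_for(gro_path)
    # The Writer picks the output format from the file extension
    with mda.Writer(out_path, n_atoms=universe.atoms.n_atoms) as writer:
        writer.write(universe.atoms)
    return out_path
```

Calling convert_gro_to_pdb on a path such as 'protein.gro' (a placeholder name) would return 'protein.pdb' once the file is written.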
Recently, I wanted to add a feature to MolPDF where, in addition to the SMILES and 2D images being produced, I can also provide IUPAC names for the “minable” pdfs.
IUPAC is the language used most commonly by wet lab chemists, whereas SMILES is more of a 1D string representation of a molecule used by machines. Increasingly, it has become more pertinent to bridge the gap between the two, and I figured it would be useful in the context of what MolPDF was designed to do. So let us do some digging at the current state of things…
We’ve got the…