Code Link: Google Colab
Recently, alongside my current research in other areas, I got into mixing fuel systems for engines. I love cars (typical) and going fast, so this is really interesting in my daily life as well.
Jet fuels are classified into 3 categories:
- Paraffin: straight or branched chain hydrocarbons.
- Naphthene: hydrocarbons in a ring structure.
- Aromatic: ring systems that have the chemical property of “aromaticity”.
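To make the three categories concrete, here is a minimal sketch that sorts a few hydrocarbon SMILES into them. It is a crude string heuristic, not a real SMILES parser: it assumes pure hydrocarbons with aromatic atoms written in lowercase (not Kekulé form). The function name `classify_hydrocarbon` and the four example molecules are my own illustrations and are not taken from the paper's data set.

```python
def classify_hydrocarbon(smiles: str) -> str:
    """Crude heuristic for plain hydrocarbon SMILES (hypothetical helper)."""
    # Aromatic carbons are written as lowercase 'c' in aromatic SMILES.
    if "c" in smiles:
        return "aromatic"
    # Ring-closure digits with no aromatic atoms indicate a saturated ring.
    if any(ch.isdigit() for ch in smiles):
        return "naphthene"
    # No rings at all: a straight or branched chain.
    return "paraffin"


# Illustrative molecules only (not from the alternative jet fuel set).
examples = {
    "CCCCCCCC": "n-octane",
    "CC(C)CC(C)(C)C": "isooctane",
    "C1CCCCC1": "cyclohexane",
    "Cc1ccccc1": "toluene",
}

labels = {smiles: classify_hydrocarbon(smiles) for smiles in examples}
for smiles, name in examples.items():
    print(f"{name:12s} {smiles:16s} {labels[smiles]}")
```

A molecule with both a saturated ring and an aromatic ring (tetralin, for example) lands in "aromatic" here, so for anything beyond a quick sanity check a proper toolkit is the better choice.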
The data is abstracted from this paper. I wrote the SMILES into Global-Chem to prepare it for the Principal Component Analysis (PCA).

Since the authors have declared 3 categories, that’s the number of clusters I put into the PCA code (number_of_clusters = 3):
from global_chem import GlobalChem
from global_chem_extensions import GlobalChemExtensions

gc = GlobalChem()
cheminformatics = GlobalChemExtensions().cheminformatics()

# Build the network and pull the SMILES stored under the jet fuel node
gc.build_global_chem_network(print_output=False, debugger=False)
smiles_list = list(gc.get_node_smiles('alternative_jet_fuels').values())

mol_ids = cheminformatics.node_pca_analysis(
    smiles_list,
    morgan_radius = 1,
    bit_representation = 512,
    number_of_clusters = 3,        # one cluster per fuel category
    number_of_components = 0.95,
    random_state = 0,
    principal_component_x = 0,
    principal_component_y = 1,
    x_axis_label = 'PC1',
    y_axis_label = 'PC2',
    plot_width = 500,
    plot_height = 500,
    title = '',
    save_file = False,
    return_mol_ids = True,
    save_principal_components = True,
)
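A quick aside on `number_of_components = 0.95`. I am assuming here that the value is forwarded to scikit-learn's `PCA`, where a float between 0 and 1 means "keep the smallest number of components whose cumulative explained variance exceeds that fraction". The sketch below reproduces that selection rule with plain NumPy on random toy bit vectors; the data is not the jet fuel fingerprints, and `components_for_variance` is a hypothetical helper name.

```python
import numpy as np


def components_for_variance(X, threshold=0.95):
    """Smallest number of principal components whose cumulative
    explained variance ratio exceeds `threshold` (hypothetical helper)."""
    Xc = X - X.mean(axis=0)                    # PCA works on centered data
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    ratio = s**2 / np.sum(s**2)                # explained variance ratio
    cumulative = np.cumsum(ratio)
    k = int(np.searchsorted(cumulative, threshold, side="right") + 1)
    return k, cumulative


# Toy stand-in for 512-bit fingerprints of 20 molecules (random bits).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(20, 512)).astype(float)

k, cumulative = components_for_variance(X, threshold=0.95)
print(f"components kept: {k}")
print(f"variance explained by those components: {cumulative[k - 1]:.3f}")
```

The plot itself only ever shows two of those components, the ones picked by `principal_component_x` and `principal_component_y`.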
And what do we get:

Looking at this more closely, I realize that 3 classifications are perhaps not enough if we want to explore this region of chemical space. Not highlighted are the yellow points, which are the straight and branched hydrocarbons, and the green points, which are the aromatics.
So what can we do with this information? Well, if I were a researcher in materials for fuels, I would mix these systems together and measure the energy of the resulting mix.
I would then start experimenting with fingerprints to predict new alternatives, focusing mainly on straight chain alkenes or cycloalkanes other than cyclohexane. There is only one cyclopentane functional group in the set. I wonder where something more ring strained, like cyclobutane, would fall among the alternatives.
Alright, that’s it for today. Happy Cheminformatics!