Debugging File Encoding and File Path Distribution Bugs for GlobalChem with Github Action Bots.

Sulstice
3 min read · May 23, 2022


Good distribution is a science, and it has been the most painful thing to learn while trying to get people to use my data and the tools I create. I’ve run into some pretty hard bugs that were not so easy to fix.

The code: https://github.com/Sulstice/global-chem

I’ve been distributing this Python package through PyPI for macOS 12 Monterey, Linux x86_64, and Windows 11. The GitHub Action bot tests it against a matrix of Python versions and operating systems.

The GlobalChem Action Bot Script is quite simple:

name: GlobalChem API
on: [pull_request, push, workflow_dispatch]
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: true
      matrix:
        os: ["ubuntu-latest", "windows-latest", "macos-latest"]
        python-version: ["3.7", "3.8", "3.9"]
    steps:
      - name: Checkout source
        uses: actions/checkout@v2

      - name: Setup python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
          architecture: x64

      - name: Install
        run: |
          python global_chem/setup.py install
          pip install python-coveralls
          pip install coveralls
          pip install coverage==4.5.4
          pip install nose
      - name: Run GlobalChem tests
        run: |
          cd global_chem
          python -m nose --verbose --with-coverage -s -w tests/

So this is pretty much how I test my system. Over time it really has helped me maintain the software, but to my dismay knowledge can be a fickle thing: you can know something is not working without knowing why. This bug sucked.

File Encoding

So here is the traceback:

======================================================================
ERROR: Test the deep layer network graphs
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\site-packages\nose\case.py", line 198, in runTest
    self.test(*self.arg)
  File "D:\a\global-chem\global-chem\global_chem\tests\test_global_chem.py", line 224, in test_deep_layer_networks
    gc.print_deep_network()
  File "D:\a\global-chem\global-chem\global_chem\global_chem\global_chem.py", line 1127, in print_deep_network
    print(PrintTreeUtilities.printTrees(_DEEP_NETWORK_KEY[ self.root_node ]))
  File "D:\a\global-chem\global-chem\global_chem\global_chem\global_chem.py", line 277, in printTrees
    print("\n".join(get_repr(node)))
  File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u25[...] character maps to <undefined>

Well, that’s weird: apparently on “windows-latest” it can’t print my graphs to the terminal. The character in question, somewhere in the \u25.. range according to the traceback, is something I was using to make a pretty terminal output.
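
The failure can be reproduced without a Windows machine. This is a minimal sketch, assuming a stand-in glyph: the exact character is cut off in the traceback, so `\u2514` (a tree-drawing corner) is only a hypothetical example, and `can_encode` is a helper name made up for illustration.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical stand-in for the glyph used in the tree output.
glyph = "\u2514"

# cp1252 is the codec named in the traceback; utf-8 is the target encoding.
cp1252_ok = can_encode(glyph, "cp1252")
utf8_ok = can_encode(glyph, "utf-8")
print(cp1252_ok, utf8_ok)
```

Plain ASCII text encodes fine under both codecs, which is probably why only the pretty-printing test tripped.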

I don’t want to lose this feature or lose my users. I first went down the path of changing the character to something that fits. The failure was also recent: I had Windows support and it was working, but all of a sudden it just stopped.

Well, it turns out that on Windows the encoding of standard in and standard out is not guaranteed to be UTF-8 when the script runs; the traceback shows the runner using cp1252. Basically, I needed to tell the Windows machine that the encoding for this particular package is utf-8. I added these two lines to the MasterClass (reconfigure() is available on Python 3.7 and later):

sys.stdin.reconfigure(encoding='utf-8')
sys.stdout.reconfigure(encoding='utf-8')

Fixed!
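
The effect of those two lines can be sketched on an in-memory stream that starts out as cp1252, the way the Windows runner's console did. This is an illustration under assumptions, not the package's code: `force_utf8` and `fake_stdout` are hypothetical names, and the glyphs are stand-ins for the real tree characters.

```python
import io


def force_utf8(stream):
    """Switch a text stream to UTF-8 when it supports reconfigure()."""
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False  # e.g. sys.stdout replaced by an object such as io.StringIO
    reconfigure(encoding="utf-8")
    return True


# Stand-in for the Windows console: a text stream that starts out as cp1252.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="cp1252", newline="\n")

glyph_line = "\u2514\u2500 node\n"  # hypothetical tree-drawing output
switched = force_utf8(fake_stdout)
fake_stdout.write(glyph_line)  # would raise UnicodeEncodeError under cp1252
fake_stdout.flush()
written_bytes = raw.getvalue()
```

In a real package the same helper would be called with `sys.stdin` and `sys.stdout`. The guard is a design choice rather than a requirement: it avoids an AttributeError if something has swapped `sys.stdout` for an object without `reconfigure()`.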

File Paths

Here is the traceback:

======================================================================
ERROR: Test the Building of the GlobalChem Network
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\site-packages\nose\case.py", line 198, in runTest
    self.test(*self.arg)
  File "D:\a\global-chem\global-chem\global_chem\tests\test_global_chem.py", line 57, in test_build_global_chem_network
    molecules = gc.get_node_smiles('emerging_perfluoroalkyls')
  File "D:\a\global-chem\global-chem\global_chem\global_chem\global_chem.py", line 664, in get_node_smiles
    raise GraphNetworkError(
global_chem.global_chem.GraphNetworkError: No Node named emerging_perfluoroalkyls exists

It couldn’t build the network, so it couldn’t fetch any of the nodes. Building the network relies on file paths.

I didn’t realize that Linux and Mac share the same file path style, whereas Windows starts the path with the letter of the hard drive or storage space followed by a colon. On the GitHub Actions Windows runner the file path tree is also split with “\\” instead of the Linux/Mac “/”:

C:\\path\\to\\file
/path/to/file

To resolve this, first add a check for the Windows operating system in your code:

self.splitter = '/'
if os.name == 'nt':
    self.splitter = '\\'
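
For comparison, the standard library already exposes the same separator as `os.sep`. The sketch below mirrors the check above in a hypothetical helper, `pick_splitter`, and compares it with the separators that `ntpath` and `posixpath` define, so both branches can be exercised on any operating system.

```python
import ntpath
import os
import posixpath


def pick_splitter(os_name):
    """Mirror the check above: backslash on Windows ('nt'), forward slash elsewhere."""
    return '\\' if os_name == 'nt' else '/'


windows_splitter = pick_splitter('nt')
posix_splitter = pick_splitter('posix')
current_splitter = pick_splitter(os.name)

# The hand-written check agrees with what the standard library reports.
matches_stdlib = (
    windows_splitter == ntpath.sep
    and posix_splitter == posixpath.sep
    and current_splitter == os.sep
)
print(matches_stdlib)
```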

And then build the file path with that separator:

absolute_file_path = self.splitter.join(os.path.abspath(__file__).split(self.splitter))
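
Splitting a path and rejoining it on the same separator returns the path unchanged, so the part doing the work here is `os.path.abspath`, which already returns the platform's native form. One possible alternative, sketched under assumptions, is to let the path functions pick the separator. `sibling_path`, `data`, and `nodes.tsv` are hypothetical names used only for illustration. Passing `ntpath` or `posixpath` explicitly makes both platforms checkable from one machine; real code would use `os.path`, which is one or the other depending on the OS.

```python
import ntpath
import posixpath
from pathlib import PureWindowsPath


def sibling_path(module_file, *parts, flavour=posixpath):
    """Build a path to something that lives next to module_file."""
    return flavour.join(flavour.dirname(module_file), *parts)


# Same call, two path flavours (file names are hypothetical).
posix_result = sibling_path("/path/to/file.py", "data", "nodes.tsv", flavour=posixpath)
windows_result = sibling_path("C:\\path\\to\\file.py", "data", "nodes.tsv", flavour=ntpath)

# pathlib expresses the same idea with the / operator.
pathlib_result = str(PureWindowsPath("C:\\path\\to\\file.py").parent / "data" / "nodes.tsv")

print(posix_result)
print(windows_result)
print(pathlib_result)
```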

And voilà, done. This took me roughly a couple of weeks to figure out, since I was relying on the GitHub Action bots and had to keep testing in my spare time, until one fateful Saturday I was just like fuck it.

Distribution on a larger scale is tough and is definitely a science. I hope we can all figure it out the more we document these bugs.
