How to interactively polish a Jupyter notebook across file formats
I am writing some new PyTorch tutorials, and one of the requirements is to convert the tutorial .ipynb file into a .py file. This is pretty easy to do using nbconvert.
However, I like using tools such as LLMs and my AutoDocsEditor tool to improve the quality of my writing. I can do this directly on the ipynb files, but it results in diffs that are not human-friendly. So, I like to convert the ipynb to markdown to make it easier to review and edit the diffs. This complicates the workflow a bit:
- Develop the tutorial interactively in a Jupyter notebook
- Use AI to polish a markdown version of the notebook, which is easier to review.
- Use AutoDocsEditor to improve the prose and style (can be done either on the ipynb or md files)
- Export the file as a Sphinx-gallery-styled Python script, which is what the PyTorch tutorials repo expects.
- Keep all three of these documents synchronized throughout the process.
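To illustrate why raw ipynb diffs are harder to review than markdown diffs, here is a minimal sketch using only the standard library. The notebook dict is a simplified stand-in shaped like nbformat 4 JSON, and the helper names (`as_markdown`, `as_ipynb`, `changed_lines`) are made up for this example. It applies a one-word prose fix, pretends the notebook was re-run, and counts the changed lines in each format.

```python
import difflib
import json

FENCE = "`" * 3  # Markdown code fence marker


def as_markdown(prose: str, code: str) -> str:
    """Text-only view: prose followed by a fenced code block, no outputs."""
    return f"{prose}\n{FENCE}python\n{code}{FENCE}\n"


def as_ipynb(prose: str, code: str, run_count: int, stdout: str) -> str:
    """Simplified nbformat-4-shaped JSON, including execution state."""
    nb = {
        "cells": [
            {"cell_type": "markdown", "metadata": {},
             "source": prose.splitlines(keepends=True)},
            {"cell_type": "code", "metadata": {},
             "execution_count": run_count,
             "outputs": [{"output_type": "stream", "name": "stdout",
                          "text": [stdout]}],
             "source": code.splitlines(keepends=True)},
        ],
        "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
    }
    return json.dumps(nb, indent=1)


def changed_lines(before: str, after: str) -> list[str]:
    """Added/removed lines from a unified diff, without headers or context."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


old_prose = "# Intro\n\nThis tutorial show how to train a model.\n"
new_prose = "# Intro\n\nThis tutorial shows how to train a model.\n"
code = "loss = train()\nprint(loss)\n"  # hypothetical cell content

md_changes = changed_lines(as_markdown(old_prose, code),
                           as_markdown(new_prose, code))
# Re-running the notebook also bumps the execution count and the output.
ipynb_changes = changed_lines(as_ipynb(old_prose, code, 1, "0.52\n"),
                              as_ipynb(new_prose, code, 2, "0.49\n"))

print(len(md_changes), "changed lines in markdown")
print(len(ipynb_changes), "changed lines in ipynb")
```

In a real notebook the gap is usually wider, since outputs can include long tracebacks or base64-encoded images.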
Unfortunately, nbconvert only supports one-way conversions. If you tried to chain conversions between formats manually, you would lose metadata and outputs, and clobber the document structure.
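As a rough illustration of that loss, the toy converter below exports a notebook-shaped dict to markdown and reads it back. It is not how nbconvert works internally; `to_markdown` and `from_markdown` are hypothetical helpers written for this sketch, and the notebook dict is heavily simplified.

```python
FENCE = "`" * 3  # Markdown code fence marker


def to_markdown(nb: dict) -> str:
    """Toy one-way export: keeps cell sources, drops everything else."""
    blocks = []
    for cell in nb["cells"]:
        if cell["cell_type"] == "code":
            blocks.append(f"{FENCE}python\n{cell['source']}\n{FENCE}")
        else:
            blocks.append(cell["source"])
    return "\n\n".join(blocks)


def from_markdown(text: str) -> dict:
    """Toy import: rebuilds cells, with nothing to restore outputs from."""
    opener, closer = f"{FENCE}python\n", f"\n{FENCE}"
    cells = []
    for block in text.split("\n\n"):  # assumes no blank lines inside a cell
        if block.startswith(opener) and block.endswith(closer):
            source = block[len(opener):-len(closer)]
            cells.append({"cell_type": "code", "source": source,
                          "outputs": [], "metadata": {}})
        else:
            cells.append({"cell_type": "markdown", "source": block,
                          "metadata": {}})
    return {"cells": cells, "metadata": {}}


original = {
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {"cell_type": "markdown", "source": "# Training loop", "metadata": {}},
        {"cell_type": "code", "source": "print(1 + 1)",
         "metadata": {"tags": ["demo"]},
         "outputs": [{"output_type": "stream", "name": "stdout",
                      "text": "2\n"}]},
    ],
}

round_tripped = from_markdown(to_markdown(original))
print(round_tripped["cells"][1]["source"])   # the source text survives
print(round_tripped["cells"][1]["outputs"])  # the outputs do not
print(round_tripped["metadata"])             # neither does notebook metadata
```

The markdown text simply has nowhere to store outputs or kernel metadata, so a naive round trip cannot bring them back.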
I found a library that supports exactly my use-case: Jupytext. It even supports the py:sphinx format!
The Jupytext workflow looks like this:
- Install deps: `pip install jupytext nbconvert`
- Pair the notebook: `jupytext --set-formats ipynb,md,py:sphinx tutorial.ipynb`. Jupytext automatically generates the markdown and py files, and understands that these are views of the same document.
- You can now edit any of these docs and then sync it to the whole group. Just make sure to edit one file format at a time to prevent losing work.
  - To sync from ipynb, use `jupytext --sync tutorial.ipynb`.
  - To sync from md, use `jupytext --sync tutorial.md`.
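Since `--sync` takes its input cells from whichever paired file was modified last, it can help to check which file that is before syncing. The helper below is a small standard-library sketch; `paired_files_by_mtime` is a hypothetical name, not part of Jupytext, and the demo uses throwaway files in a temporary directory.

```python
import os
import tempfile
from pathlib import Path


def paired_files_by_mtime(stem: Path, suffixes=(".ipynb", ".md", ".py")) -> list[Path]:
    """Return the existing paired files for `stem`, most recently modified first."""
    candidates = [stem.parent / (stem.name + suffix) for suffix in suffixes]
    existing = [path for path in candidates if path.exists()]
    return sorted(existing, key=lambda p: p.stat().st_mtime, reverse=True)


# Demo with empty placeholder files and hand-set modification times.
with tempfile.TemporaryDirectory() as tmp:
    stem = Path(tmp) / "tutorial"
    for age, suffix in enumerate((".md", ".ipynb", ".py")):
        path = stem.parent / (stem.name + suffix)
        path.write_text("")
        # Older files get smaller timestamps, so tutorial.md is the newest.
        os.utime(path, (1_000_000 - age, 1_000_000 - age))
    newest_name = paired_files_by_mtime(stem)[0].name

print("Most recently edited:", newest_name)
```

If the newest file is not the one you meant to edit, that is a hint that two formats were changed since the last sync.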
Conveniently, Jupytext even has a Cursor IDE extension.
Gotcha: RST formatting in py:sphinx files
Unfortunately, Jupytext does not automatically convert markdown-style syntax to the RST-style syntax that I need. It does have a `sphinx_convert_rst2md` option to convert RST -> markdown, but not the other way around.
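To make the gap concrete, here is a minimal sketch of the kind of markdown-to-RST rewriting that is missing. It handles only inline code, links, and headings with regular expressions. The function names are made up for this example, and this is not what Jupytext or the PyTorch conversion script does internally.

```python
import re


def md_inline_to_rst(text: str) -> str:
    """Rewrite Markdown inline code and links as their RST equivalents."""
    # Inline code first: single backticks become double backticks in RST.
    text = re.sub(r"(?<!`)`([^`\n]+)`(?!`)", r"``\1``", text)
    # Then links: [label](url) becomes `label <url>`_
    text = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"`\1 <\2>`_", text)
    return text


def md_heading_to_rst(line: str) -> str:
    """Rewrite '#'-style headings (levels 1 to 3) as underlined RST titles."""
    match = re.match(r"^(#{1,3})\s+(.*)$", line)
    if not match:
        return line
    # The underline characters are a common convention, chosen here by assumption.
    underline = {1: "=", 2: "-", 3: "~"}[len(match.group(1))]
    title = match.group(2).strip()
    return f"{title}\n{underline * len(title)}"


print(md_heading_to_rst("## Setup"))
print(md_inline_to_rst(
    "Use `torch.compile` as described in the [docs](https://pytorch.org/docs)."
))
```

Bold and italic markers already look the same in both syntaxes, but lists, tables, images, and math would each need their own handling.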
Our options are to either 1) write in RST format within our ipynb and markdown files, or 2) not use Jupytext to generate our py:sphinx files. I prefer the latter. Luckily, the PyTorch repo linked to a conversion script that generates the py:sphinx file directly from the notebook. It was a bit cumbersome to run, so I converted it into an isolated one-liner command. You can run it with `uv run https://tools.ricardodecal.com/python/ipynb_to_py_sphinx.py notebook.ipynb`.
The workflow here looks like:
- Edit markdown file in idiomatic markdown format
- Run `uv run jupytext --sync foo_tutorial.md` to sync the changes in the md file to the ipynb file
- Run `uvx ruff check --fix foo_tutorial.ipynb` to polish the code in the notebook
- Run `uv run jupytext --sync foo_tutorial.ipynb` to sync the polished code back to the md file
- Run `uv run https://tools.ricardodecal.com/python/ipynb_to_py_sphinx.py foo_tutorial.ipynb` to generate the py:sphinx file
- Test that the py file runs: `uv run python foo_tutorial.py`
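The command steps above could be chained in a small script. This is a sketch under the assumption that `uv` and `uvx` are on the PATH. The function names are hypothetical, and the commands are copied from the steps above.

```python
import subprocess

CONVERTER_URL = "https://tools.ricardodecal.com/python/ipynb_to_py_sphinx.py"


def build_steps(stem: str) -> list[list[str]]:
    """Return the workflow commands, in order, for a tutorial named `stem`."""
    return [
        ["uv", "run", "jupytext", "--sync", f"{stem}.md"],     # md -> ipynb
        ["uvx", "ruff", "check", "--fix", f"{stem}.ipynb"],    # polish the code
        ["uv", "run", "jupytext", "--sync", f"{stem}.ipynb"],  # ipynb -> md
        ["uv", "run", CONVERTER_URL, f"{stem}.ipynb"],         # write py:sphinx
        ["uv", "run", "python", f"{stem}.py"],                 # smoke test
    ]


def run_steps(stem: str) -> None:
    """Run each step in order and stop at the first one that fails."""
    for command in build_steps(stem):
        print("$", " ".join(command))
        subprocess.run(command, check=True)
```

Calling `run_steps("foo_tutorial")` from the directory that contains the tutorial runs the whole chain. `check=True` raises as soon as a step exits with a non-zero status, so a failed sync never feeds into the conversion.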