from llms_txt import *
Python module & CLI
Given an llms.txt file, this provides a CLI and Python API to parse the file and create an XML context file from it. The input file should follow this format:
# FastHTML
> FastHTML is a python library which...
When writing FastHTML apps remember to:
- Thing to remember
## Docs
- [Surreal](https://host/README.md): Tiny jQuery alternative with Locality of Behavior
- [FastHTML quick start](https://host/quickstart.html.md): An overview of FastHTML features
## Examples
- [Todo app](https://host/adv_app.py)
## Optional
- [Starlette docs](https://host/starlette-sml.md): A subset of the Starlette docs
Install
pip install llms-txt
How to use
CLI
After installation, llms_txt2ctx is available in your terminal.
To get help for the CLI:
llms_txt2ctx -h
To convert an llms.txt file to XML context and save it to llms.md:
llms_txt2ctx llms.txt > llms.md
Pass --optional True to add the ‘optional’ section of the input file.
Python module
samp = Path('llms-sample.txt').read_text()
Use parse_llms_file to create a data structure with the sections of an llms.txt file (you can also add optional=True if needed):
parsed = parse_llms_file(samp)
list(parsed)
['title', 'summary', 'info', 'sections']
parsed.title,parsed.summary
('FastHTML',
'FastHTML is a python library which brings together Starlette, Uvicorn, HTMX, and fastcore\'s `FT` "FastTags" into a library for creating server-rendered hypermedia applications.')
list(parsed.sections)
['Docs', 'Examples', 'Optional']
parsed.sections.Optional[0]
{ 'desc': 'A subset of the Starlette documentation useful for FastHTML '
'development.',
'title': 'Starlette full documentation',
'url': 'https://gist.githubusercontent.com/jph00/809e4a4808d4510be0e3dc9565e9cbd3/raw/9b717589ca44cedc8aaf00b2b8cacef922964c0f/starlette-sml.md'}
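To make the shape of this structure concrete, here is a minimal sketch that walks it. It does not import the library: `parsed` below is a hypothetical plain-dict stand-in with the same keys as the result shown above, filled with the links from the sample format, and `iter_links` is an illustrative helper, not part of the package.

```python
# Hypothetical stand-in for a parsed result: a plain dict with the same keys
# as the structure shown above, filled with the links from the sample format.
parsed = {
    "title": "FastHTML",
    "summary": "FastHTML is a python library which...",
    "info": "When writing FastHTML apps remember to:\n- Thing to remember",
    "sections": {
        "Docs": [
            {"title": "Surreal", "url": "https://host/README.md",
             "desc": "Tiny jQuery alternative with Locality of Behavior"},
            {"title": "FastHTML quick start", "url": "https://host/quickstart.html.md",
             "desc": "An overview of FastHTML features"},
        ],
        "Examples": [
            {"title": "Todo app", "url": "https://host/adv_app.py", "desc": None},
        ],
        "Optional": [
            {"title": "Starlette docs", "url": "https://host/starlette-sml.md",
             "desc": "A subset of the Starlette docs"},
        ],
    },
}

def iter_links(parsed, include_optional=False):
    "Yield (section, title, url) for each link, skipping 'Optional' unless asked."
    for name, links in parsed["sections"].items():
        if name == "Optional" and not include_optional:
            continue
        for link in links:
            yield name, link["title"], link["url"]

for section, title, url in iter_links(parsed):
    print(f"{section}: {title} -> {url}")
```

The `include_optional` switch is only meant to mirror the idea behind the `optional` setting described above; the library's own handling may differ.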
Use create_ctx to create an LLM context file with XML sections, suitable for systems such as Claude (this is what the CLI calls behind the scenes).
ctx = create_ctx(samp)
print(ctx[:300])
<project title="FastHTML" summary='FastHTML is a python library which brings together Starlette, Uvicorn, HTMX, and fastcore's `FT` "FastTags" into a library for creating server-rendered hypermedia applications.'>
Remember:
- Use `serve()` for running uvicorn (`if __name__ == "__main__"` is not
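As a rough illustration of what "XML sections" can mean, the sketch below builds a small XML tree from a parsed-style dict using only the standard library. This is not how `create_ctx` is implemented. The function name `sketch_ctx`, the per-section element names, and the `doc` element are all assumptions made for the example. Only the `project` element with `title` and `summary` attributes is taken from the output above.

```python
import xml.etree.ElementTree as ET

def sketch_ctx(parsed):
    "Build a small XML string from a parsed llms.txt dict (illustrative only)."
    project = ET.Element("project", title=parsed["title"], summary=parsed["summary"])
    project.text = parsed.get("info") or ""
    for name, links in parsed["sections"].items():
        # Assumed layout: one element per section, one <doc> per link.
        section = ET.SubElement(project, name.lower())
        for link in links:
            doc = ET.SubElement(section, "doc", title=link["title"], url=link["url"])
            # This sketch stores only the description; it fetches nothing.
            doc.text = link.get("desc") or ""
    return ET.tostring(project, encoding="unicode")

# Hypothetical sample data in the shape of a parsed llms.txt file.
sample = {
    "title": "FastHTML",
    "summary": "FastHTML is a python library which...",
    "info": "Remember:",
    "sections": {
        "Docs": [{"title": "Surreal", "url": "https://host/README.md",
                  "desc": "Tiny jQuery alternative with Locality of Behavior"}],
    },
}

print(sketch_ctx(sample))
```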
Implementation and tests
To show how simple it is to parse llms.txt files, here’s a complete parser in <20 lines of code with no dependencies:
from pathlib import Path
import re,itertools

def chunked(it, chunk_sz):
    it = iter(it)
    return iter(lambda: list(itertools.islice(it, chunk_sz)), [])

def parse_llms_txt(txt):
    "Parse llms.txt file contents in `txt` to a `dict`"
    def _p(links):
        link_pat = r'-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^\)]+)\)(?::\s*(?P<desc>.*))?'
        return [re.search(link_pat, l).groupdict()
                for l in re.split(r'\n+', links.strip()) if l.strip()]

    start,*rest = re.split(r'^##\s*(.*?$)', txt, flags=re.MULTILINE)
    sects = {k: _p(v) for k,v in dict(chunked(rest, 2)).items()}
    pat = r'^#\s*(?P<title>.+?$)\n+(?:^>\s*(?P<summary>.+?$)$)?\n+(?P<info>.*)'
    d = re.search(pat, start.strip(), (re.MULTILINE|re.DOTALL)).groupdict()
    d['sections'] = sects
    return d
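
The sectioning step in the parser is compact, so a small standalone sketch may help. When `re.split` is given a pattern with one capture group, the captured heading text is kept in the result, which therefore alternates between headings and bodies. `chunked` then pairs them up. The sample text here is made up for illustration.

```python
import itertools
import re

def chunked(it, chunk_sz):
    "Return an iterator over successive lists of `chunk_sz` items from `it`."
    it = iter(it)
    # iter(callable, sentinel) stops once islice yields an empty list.
    return iter(lambda: list(itertools.islice(it, chunk_sz)), [])

txt = """# Title

> Summary

## Docs

- [A](https://host/a.md): first

## Examples

- [B](https://host/b.py)
"""

# Result layout: preamble, heading, body, heading, body, ...
start, *rest = re.split(r'^##\s*(.*?$)', txt, flags=re.MULTILINE)

# Each two-item chunk is [heading, body], which dict() accepts as a key/value pair.
pairs = dict(chunked(rest, 2))

print(list(pairs))
print(pairs['Docs'].strip())
```

Everything before the first `##` heading ends up in `start`, which is why the parser can match the title, summary, and info against it separately.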
We have provided a test suite in tests/test-parse.py and confirmed that this implementation passes all tests.
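
The contents of tests/test-parse.py are not reproduced here. As a hypothetical example of the kind of check such a suite could contain, the sketch below exercises the link pattern from the parser above on two lines taken from the sample format, one with a description and one without. The names `parse_link` and the two `test_` functions are invented for this illustration.

```python
import re

# Same link pattern as in the parser above, written as a raw string.
link_pat = r'-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^\)]+)\)(?::\s*(?P<desc>.*))?'

def parse_link(line):
    "Return the title, url, and desc groups for one markdown list-item link."
    return re.search(link_pat, line).groupdict()

def test_link_with_desc():
    link = parse_link("- [Surreal](https://host/README.md): Tiny jQuery alternative with Locality of Behavior")
    assert link["title"] == "Surreal"
    assert link["url"] == "https://host/README.md"
    assert link["desc"] == "Tiny jQuery alternative with Locality of Behavior"

def test_link_without_desc():
    link = parse_link("- [Todo app](https://host/adv_app.py)")
    assert link["title"] == "Todo app"
    assert link["url"] == "https://host/adv_app.py"
    # The description group is optional, so it is None when absent.
    assert link["desc"] is None

test_link_with_desc()
test_link_without_desc()
```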