Processing Flow
This page describes the data flow through Docu, from Python file to documentation output.
Overview
The documentation generation process in Docu follows these main steps:
- Input Processing: Parse Python file and extract special comments
- Documentation Extraction: Extract documentation from comments
- Code Analysis: Analyze code structure using AST
- Documentation Processing: Process documentation according to style
- Output Generation: Generate documentation in the requested format
Detailed Flow
┌────────────┐    ┌───────────────┐    ┌──────────────┐
│ Python File│───▶│  AST Parsing  │───▶│Comment Parser│
└────────────┘    └───────────────┘    └──────┬───────┘
                                              │
                                              ▼
┌──────────────┐    ┌────────────────┐    ┌──────────────┐
│Output Format │◀───│ Documentation  │◀───│ Doc Structure│
│  (MD/HTML)   │    │   Generator    │    │  Extraction  │
└──────────────┘    └────────────────┘    └──────────────┘
Step-By-Step Explanation
1. File Reading and AST Parsing
# In parsers.py
def parse_python_file(file_path: str) -> Dict[str, DocItem]:
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()
    # Extract all #/ comments
    doc_comments = extract_doc_comments(file_path)
    # Parse the Python file using AST
    tree = ast.parse(content, filename=file_path)
    # Process the AST...
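First, Docu reads the Python file and parses it with Python's built-in ast module, which produces an abstract syntax tree (AST): a structured representation of the code that Docu can analyze. Because the AST does not retain comments, the #/ comments are collected separately by extract_doc_comments.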
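As a rough illustration of what the AST gives access to, the sketch below parses a small in-memory source string and lists its class and function definitions with their line numbers. SAMPLE_SOURCE and list_definitions are hypothetical names used only for this example; they are not part of Docu.

```python
import ast
from typing import List, Tuple

# Hypothetical sample source, used only for illustration.
SAMPLE_SOURCE = '''
class Greeter:
    def greet(self, name: str) -> str:
        return "Hello, " + name

def main() -> None:
    pass
'''


def list_definitions(source: str) -> List[Tuple[str, str, int]]:
    """Return (kind, name, line number) for each class and function definition."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            found.append(("class", node.name, node.lineno))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append(("function", node.name, node.lineno))
    # ast.walk makes no ordering promise that matters here, so sort by line number.
    return sorted(found, key=lambda item: item[2])


definitions = list_definitions(SAMPLE_SOURCE)
for kind, name, lineno in definitions:
    print(f"{kind} {name} (line {lineno})")
```

Note that ast.walk does not distinguish methods from module-level functions; telling them apart requires tracking the enclosing class while traversing.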
2. Comment Extraction
# In parsers.py
def extract_doc_comments(file_path: str) -> List[Tuple[int, str]]:
    doc_comments: List[Tuple[int, str]] = []
    with open(file_path, 'r', encoding='utf-8') as f:
        for i, line in enumerate(f, 1):
            line = line.strip()
            if line.startswith('#/'):
                comment = line[2:].strip()
                doc_comments.append((i, comment))
    return doc_comments
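Docu scans the file line by line and extracts every line that, after leading whitespace is stripped, starts with the special #/ comment marker. Each match is stored with its 1-based line number and the comment text that follows the marker. These comments contain the documentation that will be processed.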
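To make the matching rule concrete, here is a minimal sketch that applies the same logic to an in-memory string instead of a file. The function name extract_doc_comments_from_text is hypothetical and exists only for this example.

```python
from typing import List, Tuple


def extract_doc_comments_from_text(text: str) -> List[Tuple[int, str]]:
    """Collect (line number, comment text) for lines that start with '#/'."""
    doc_comments: List[Tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        if stripped.startswith('#/'):
            doc_comments.append((lineno, stripped[2:].strip()))
    return doc_comments


sample = (
    "#/ Add two numbers.\n"
    "def add(a, b):\n"
    "    #/ Indented markers are matched too.\n"
    "    return a + b  #/ a trailing marker is not matched\n"
)
comments = extract_doc_comments_from_text(sample)
print(comments)
```

Only lines 1 and 3 match: the marker on line 4 follows code, so the stripped line does not start with #/.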
3. Code Structure Analysis
While traversing the AST, Docu identifies:
- Classes
- Functions
- Methods
- Module-level documentation
- Type annotations
- Return types
- Class fields
It associates the special comments with these code elements based on their positions in the file.
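The exact association rule is not shown on this page. One plausible approach, sketched here under the assumption that a run of #/ lines directly above a definition belongs to that definition, is to walk upward from each node's line number. The function attach_comments is hypothetical and is not Docu's implementation.

```python
import ast
from typing import Dict, List


def attach_comments(source: str) -> Dict[str, List[str]]:
    """Map each class/function name to the '#/' lines directly above it."""
    # Index every '#/' comment by its 1-based line number.
    comments: Dict[int, str] = {}
    for lineno, line in enumerate(source.splitlines(), 1):
        stripped = line.strip()
        if stripped.startswith('#/'):
            comments[lineno] = stripped[2:].strip()

    attached: Dict[str, List[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            block: List[str] = []
            lineno = node.lineno - 1
            # Walk upward while the preceding lines are '#/' comments.
            while lineno in comments:
                block.append(comments[lineno])
                lineno -= 1
            attached[node.name] = list(reversed(block))
    return attached


sample = "\n".join([
    "#/ Greets people.",
    "class Greeter:",
    "    #/ Build a greeting.",
    "    #/ Returns the greeting text.",
    "    def greet(self, name):",
    "        return 'Hello, ' + name",
])
attached = attach_comments(sample)
print(attached)
```

Keying the result by bare name is a simplification: two methods with the same name in different classes would collide, so a real implementation would likely use a qualified name or the parent element as part of the key.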
4. Documentation Processing
# In doc_parsers.py
class GoogleStyleParser(DocStyleParser):
    def parse(self, docstring: str) -> Dict[str, str]:
        # Process Google-style docstrings
        sections = {'description': '', 'args': [], 'returns': '', 'raises': []}
        # Parsing logic...
        return sections
The extracted documentation is then parsed according to the selected style (Google, NumPy, or Sphinx). Parsing yields structured information such as argument descriptions, return value descriptions, and raised exceptions.
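The parsing logic itself is elided in the excerpt above. As a minimal sketch of what Google-style section parsing can look like, the function below fills the same four keys. It assumes section headers appear alone on a line (Args:, Returns:, Raises:) and that each argument or exception entry fits on one line; parse_google_style is a hypothetical name, not Docu's parser.

```python
from typing import Any, Dict

# Section headers recognized by this sketch, mapped to result keys.
SECTION_HEADERS = {'Args:': 'args', 'Returns:': 'returns', 'Raises:': 'raises'}


def parse_google_style(docstring: str) -> Dict[str, Any]:
    """Split a Google-style docstring into description, args, returns, raises."""
    sections: Dict[str, Any] = {'description': '', 'args': [], 'returns': '', 'raises': []}
    current = 'description'
    for raw_line in docstring.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line in SECTION_HEADERS:
            current = SECTION_HEADERS[line]
            continue
        if current in ('args', 'raises'):
            # Entries look like "name: explanation".
            name, _, text = line.partition(':')
            sections[current].append((name.strip(), text.strip()))
        else:
            # Description and returns accumulate as space-joined text.
            sections[current] = (sections[current] + ' ' + line).strip()
    return sections


example = """
Add two numbers.

Args:
    a: First operand.
    b: Second operand.

Returns:
    The sum of a and b.

Raises:
    TypeError: If the operands cannot be added.
"""
parsed = parse_google_style(example)
print(parsed)
```

Multi-line argument descriptions and type hints in parentheses, such as a (int), are deliberately left out of this sketch.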
5. Documentation Structure Creation
# In parsers.py
doc_item = DocItem(
    name=node.name,
    doc='\n'.join(item_docs) if item_docs else "",
    item_type=item_type,
    lineno=node.lineno,
    parent=parent,
    args=args,
    return_type=return_type,
    fields=fields
)
Docu creates a structured representation of the documentation using the DocItem class. Each DocItem records the element's name, type, line number, parent, arguments, return type, and fields, together with its joined documentation text.
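The definition of DocItem is not shown on this page. The sketch below is one way such a container could be declared as a dataclass; the class name DocItemSketch, the field types, and the defaults are all assumptions inferred from the constructor call above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DocItemSketch:
    """Hypothetical stand-in for DocItem; field types are assumed."""
    name: str
    doc: str
    item_type: str
    lineno: int
    parent: Optional[str] = None
    args: List[Dict[str, str]] = field(default_factory=list)
    return_type: Optional[str] = None
    fields: List[Dict[str, str]] = field(default_factory=list)


# Join the collected comment lines the same way the excerpt does.
item_docs = ['Build a greeting.', 'Returns the greeting text.']
item = DocItemSketch(
    name='greet',
    doc='\n'.join(item_docs) if item_docs else "",
    item_type='method',
    lineno=5,
    parent='Greeter',
    args=[{'name': 'name', 'type': 'str'}],
    return_type='str',
)
print(item)
```

Using default_factory for the list fields avoids sharing one mutable list across instances.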
6. Output Generation
Markdown Generation
# In generators.py
def generate_markdown_docs(doc_items: Dict[str, DocItem]) -> str:
    md_content = []
    # Generate markdown content from doc_items
    return '\n'.join(md_content)
For markdown output, Docu generates a structured markdown document from the DocItem objects.
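The body of the generator is elided above. A minimal sketch of the accumulate-and-join pattern it uses might look like the following; the Item tuple, the heading layout, and the placeholder text are assumptions made for this example, not Docu's actual output format.

```python
from typing import Dict, List, NamedTuple


class Item(NamedTuple):
    """Hypothetical, reduced stand-in for DocItem."""
    name: str
    doc: str
    item_type: str


def generate_markdown_sketch(doc_items: Dict[str, Item]) -> str:
    """Build one markdown section per documented item."""
    md_content: List[str] = ['# API Documentation', '']
    for item in doc_items.values():
        md_content.append(f'## {item.item_type} `{item.name}`')
        md_content.append('')
        md_content.append(item.doc or '*No documentation.*')
        md_content.append('')
    return '\n'.join(md_content)


items = {
    'add': Item(name='add', doc='Add two numbers.', item_type='function'),
    'Greeter': Item(name='Greeter', doc='', item_type='class'),
}
markdown = generate_markdown_sketch(items)
print(markdown)
```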
HTML Generation
# In generators.py
def generate_html_docs(doc_items: Dict[str, DocItem], template_name: str = 'default', doc_style: str = 'google') -> str:
    # Get template manager and parser
    template_manager = TemplateManager()
    template = template_manager.get_template(template_name)
    # Prepare template data
    template_data = {
        'items': doc_items,
        # Additional data...
    }
    # Render template
    return template.render(**template_data)
For HTML output, Docu prepares data for the template and then renders the selected HTML template with this data.
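The prepare-then-render pattern can be illustrated without Docu's TemplateManager, whose interface is not documented here. The sketch below swaps in string.Template from the standard library as the template engine; PAGE_TEMPLATE and render_html_sketch are hypothetical names, and the markup is invented for the example.

```python
import html
from string import Template
from typing import Dict

# Stand-in template; Docu's real templates are not shown on this page.
PAGE_TEMPLATE = Template('<html><body><h1>$title</h1>\n$body\n</body></html>')


def render_html_sketch(docs: Dict[str, str], title: str = 'Documentation') -> str:
    """Prepare template data from name -> doc text, then render the page."""
    sections = []
    for name, doc in docs.items():
        # Escape user-provided text so it cannot inject markup.
        sections.append(
            f'<section><h2>{html.escape(name)}</h2><p>{html.escape(doc)}</p></section>'
        )
    template_data = {'title': html.escape(title), 'body': '\n'.join(sections)}
    return PAGE_TEMPLATE.substitute(**template_data)


page = render_html_sketch({'add': 'Returns a + b when a < b or otherwise.'})
print(page)
```

Keeping data preparation separate from rendering means the same template_data could be passed to a different template without touching the extraction code.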
7. Output Handling
# In docgen.py
if output_dir:
    os.makedirs(output_dir, exist_ok=True)
    output_path = os.path.join(output_dir, output_filename)
    with open(output_path, 'w', encoding='utf-8') as f:
        f.write(content)
    return output_path
return content
Finally, Docu either saves the generated documentation to a file or returns it as a string, depending on whether an output directory was specified.
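The two outcomes can be seen side by side in the sketch below, which wraps the same logic in a hypothetical function named write_or_return and exercises it against a temporary directory.

```python
import os
import tempfile
from typing import Optional


def write_or_return(content: str, output_filename: str,
                    output_dir: Optional[str] = None) -> str:
    """Write content to output_dir and return the path, or return content as-is."""
    if output_dir:
        os.makedirs(output_dir, exist_ok=True)
        output_path = os.path.join(output_dir, output_filename)
        with open(output_path, 'w', encoding='utf-8') as f:
            f.write(content)
        return output_path
    return content


with tempfile.TemporaryDirectory() as tmp:
    # The 'docs' subdirectory does not exist yet; makedirs creates it.
    saved_path = write_or_return('# Title\n', 'example.md', os.path.join(tmp, 'docs'))
    with open(saved_path, encoding='utf-8') as f:
        saved_text = f.read()

returned_text = write_or_return('# Title\n', 'example.md')
print(saved_path)
print(returned_text)
```

Because both branches return a string, the caller has to know whether an output directory was passed in order to interpret the result as a path or as document content.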
Error Handling
Throughout the process, Docu includes error handling to deal with:
- Missing files
- Invalid Python syntax
- Unsupported documentation styles
- Missing templates
- File writing errors
Error messages are propagated to the user either through exceptions or through the CLI output.
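As an illustration of the first two cases in the list above, the sketch below converts a missing file and a syntax error into readable messages. The function safe_parse and the message wording are assumptions for this example; Docu's actual exception types and messages are not shown on this page.

```python
import ast
import os
import tempfile
from typing import Optional, Tuple


def safe_parse(file_path: str) -> Tuple[Optional[ast.Module], Optional[str]]:
    """Return (tree, None) on success or (None, error message) on failure."""
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            source = f.read()
    except FileNotFoundError:
        return None, f'File not found: {file_path}'
    try:
        return ast.parse(source, filename=file_path), None
    except SyntaxError as exc:
        return None, f'Invalid Python syntax in {file_path} at line {exc.lineno}: {exc.msg}'


with tempfile.TemporaryDirectory() as tmp:
    # Case 1: the file does not exist.
    missing_tree, missing_error = safe_parse(os.path.join(tmp, 'missing.py'))

    # Case 2: the file exists but is not valid Python.
    broken_path = os.path.join(tmp, 'broken.py')
    with open(broken_path, 'w', encoding='utf-8') as f:
        f.write('def broken(:\n')
    broken_tree, broken_error = safe_parse(broken_path)

print(missing_error)
print(broken_error)
```

Returning a message suits CLI output; a library caller might prefer that the original exception be re-raised instead.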