ADR-012: Manual HTTP Parsing
Status: ✅ Accepted
Date: 2026-01-29
Deciders: Rodrigo Roldán
Context
Options considered for parsing HTTP responses:
Manual parsing: implement our own parser
Stdlib http.client: use the standard library's HTTP client
External library: use h11, httpcore, etc.
Considerations:
ADR-001 establishes zero dependencies
stdlib http.client is sync-only and somewhat limited
Manual parsing gives full control
Decision
Implement a manual HTTP/1.1 parser in reqivo.http.http11.
Parser characteristics:
class HttpParser:
    """HTTP/1.1 response parser."""

    def parse_response(self, data: bytes) -> tuple[str, Headers, bytes]:
        """
        Parse a raw HTTP response.

        Returns: (status_line, headers, body)
        Raises: InvalidResponseError if the response is malformed
        """
        # Parse status line
        # Parse headers
        # Extract body
Implementation principles:
✅ RFC 7230-7235 compliance (HTTP/1.1)
✅ Robust against malformed responses
✅ Configurable limits (header size, body size)
✅ Clear error messages
✅ No complex regex (simple string operations)
Scope:
✅ Status line parsing
✅ Header parsing (case-insensitive)
✅ Chunked transfer encoding
✅ Content-Length body reading
✅ Duplicate headers
❌ No multiline headers (obs-fold is obsolete)
❌ No auto-decompression (gzip, br) in v0.1.x
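The chunked transfer encoding item in the scope can be illustrated with a minimal sketch. It assumes the whole chunked body is already in memory, ignores chunk extensions and trailer fields, and uses `ValueError` as a stand-in for the project's own error type. The name `decode_chunked` is hypothetical and not part of reqivo.

```python
import string


def decode_chunked(body: bytes) -> bytes:
    """Decode a complete chunked-encoded body (hypothetical helper)."""
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a hexadecimal size line ending in CRLF.
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("Missing CRLF after chunk size")
        # Chunk extensions (";name=value") are dropped in this sketch.
        size_text = body[pos:line_end].split(b";", 1)[0].strip().decode("ascii", "replace")
        if not size_text or any(ch not in string.hexdigits for ch in size_text):
            raise ValueError(f"Invalid chunk size: {size_text!r}")
        size = int(size_text, 16)
        pos = line_end + 2
        if size == 0:
            # A zero-size chunk ends the body; trailers are ignored here.
            return bytes(decoded)
        chunk = body[pos:pos + size]
        # The chunk data must be complete and followed by its own CRLF.
        if len(chunk) != size or body[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("Truncated or malformed chunk")
        decoded += chunk
        pos += size + 2
```

For example, `decode_chunked(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")` yields `b"Wikipedia"`. A streaming parser would need to keep state between reads instead of requiring the full body up front.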
Consequences
Positive ✅
Full control: parsing can be tailored to our needs
Zero deps: complies with ADR-001
Debuggable: our own code is easier to debug
Optimizable: we can optimize bottlenecks
Educational: we understand HTTP in depth
Negative ❌
More code: the parser is non-trivial (~150 LOC)
Potential bugs: a new implementation carries risk
Maintenance: we must keep the parser compliant with the RFCs
Edge cases: unusual servers can break the parser
Mitigations
Exhaustive tests: 97%+ coverage on the parser
RFC compliance tests: test against known responses
Fuzzing: (future) fuzzing for edge cases
Limits: size limits help prevent DoS
Implementation Notes
Status line parsing:
# HTTP/1.1 200 OK
status_line, rest = data.split(b"\r\n", 1)
http_version, status_code, reason = status_line.split(b" ", 2)
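The two-line split above raises an unhelpful unpacking error when the status line is malformed, or when a server omits the reason phrase and its preceding space. A more defensive version might look like the sketch below. The function name `parse_status_line` is hypothetical, and `ValueError` stands in for `InvalidResponseError`.

```python
def parse_status_line(line: bytes) -> tuple[str, int, str]:
    """Split a status line into (version, status code, reason phrase)."""
    parts = line.split(b" ", 2)
    if len(parts) < 2:
        raise ValueError(f"Malformed status line: {line!r}")
    version = parts[0].decode("ascii", "replace")
    if not version.startswith("HTTP/"):
        raise ValueError(f"Unexpected protocol version: {version!r}")
    code = parts[1]
    # The status code is exactly three ASCII digits.
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"Invalid status code: {code!r}")
    # The reason phrase may be empty or missing; tolerate both.
    reason = parts[2].decode("latin-1") if len(parts) == 3 else ""
    return version, int(code), reason
```

With this sketch, `parse_status_line(b"HTTP/1.1 404 Not Found")` returns `("HTTP/1.1", 404, "Not Found")`, and a line such as `b"HTTP/1.1 204"` is accepted with an empty reason phrase.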
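Header parsing:
# Header: Value\r\n
# Header: Value\r\n
# \r\n ← End of headers
headers_end = data.find(b"\r\n\r\n")
if headers_end == -1:
    raise InvalidResponseError("Unterminated header section")
# Skip the status line so it is not parsed as a header
headers_text = data[len(status_line) + 2:headers_end]
for line in headers_text.split(b"\r\n"):
    if not line:
        continue  # response without header fields
    # Whitespace after the colon is optional, so split on ":" and strip
    key, _, value = line.partition(b":")
    headers[key.decode()] = value.strip().decode()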
Duplicate headers:
# Set-Cookie: a=1
# Set-Cookie: b=2
# → headers["Set-Cookie"] = ["a=1", "b=2"]
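One way to combine case-insensitive lookup with duplicate headers is a container that stores a list of values per lowercased name. The sketch below is illustrative only: `MultiHeaders` and `parse_header_block` are hypothetical names and do not describe the actual `Headers` type in reqivo.

```python
from typing import Optional


class MultiHeaders:
    """Hypothetical case-insensitive, multi-value header container."""

    def __init__(self) -> None:
        self._values: dict[str, list[str]] = {}

    def add(self, name: str, value: str) -> None:
        # Names are stored lowercased; repeated names accumulate values.
        self._values.setdefault(name.lower(), []).append(value)

    def get_all(self, name: str) -> list[str]:
        return list(self._values.get(name.lower(), []))

    def get(self, name: str, default: Optional[str] = None) -> Optional[str]:
        values = self._values.get(name.lower())
        return values[0] if values else default


def parse_header_block(block: bytes) -> MultiHeaders:
    """Parse CRLF-separated header lines (without the status line)."""
    headers = MultiHeaders()
    for line in block.split(b"\r\n"):
        if not line:
            continue
        name, sep, value = line.partition(b":")
        # Reject lines without a colon or with whitespace around the name.
        if not sep or not name or name != name.strip():
            raise ValueError(f"Malformed header line: {line!r}")
        headers.add(name.decode("ascii"), value.strip().decode("latin-1"))
    return headers
```

Parsing `b"Set-Cookie: a=1\r\nset-cookie: b=2"` with this sketch makes `get_all("Set-Cookie")` return `["a=1", "b=2"]`, while `get` returns only the first value.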
Security Considerations
Header injection prevention:
# Reject header values containing CR or LF; a bare "\r" or "\n" is enough to inject a line
if "\r" in header_value or "\n" in header_value:
    raise ValueError("Header injection attempt")
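The same idea can be extended to header names, which RFC 7230 restricts to "token" characters. The sketch below is an assumption about how such a check might be packaged; `validate_header` is a hypothetical helper, not an existing reqivo function.

```python
import string

# Characters allowed in a header field name (RFC 7230 "token").
_TOKEN_CHARS = frozenset(string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~")


def validate_header(name: str, value: str) -> None:
    """Raise ValueError if a header could be used for injection."""
    if not name or any(ch not in _TOKEN_CHARS for ch in name):
        raise ValueError(f"Invalid header name: {name!r}")
    # CR, LF and NUL must never appear in a field value.
    if any(ch in value for ch in ("\r", "\n", "\0")):
        raise ValueError(f"Invalid character in value of header {name!r}")
```

Calling `validate_header("X-Token", "abc")` passes silently, while a value such as `"a\nSet-Cookie: x=1"` or a name such as `"Bad Name"` raises `ValueError`.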
Size limits (v0.2.0):
class HttpParser:
    def __init__(self, max_header_size: int = 8192, max_body_size: int = 10_000_000):
        self.max_header_size = max_header_size
        self.max_body_size = max_body_size
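How these limits are enforced is not specified here. One possible approach, sketched below under that caveat, is to search for the end of the header section only within the allowed window and to check the body length after the split. `LimitedParser` and `split_message` are hypothetical names, and `ValueError` stands in for the parser's own error type.

```python
class LimitedParser:
    """Hypothetical sketch of size-limit enforcement."""

    def __init__(self, max_header_size: int = 8192, max_body_size: int = 10_000_000) -> None:
        self.max_header_size = max_header_size
        self.max_body_size = max_body_size

    def split_message(self, data: bytes) -> tuple[bytes, bytes]:
        """Return (head, body), where head is the status line plus headers."""
        # Only scan as far as a header section of the maximum size could reach.
        window = data[: self.max_header_size + 4]
        end = window.find(b"\r\n\r\n")
        if end == -1:
            if len(data) > self.max_header_size:
                raise ValueError("Header section exceeds max_header_size")
            raise ValueError("Incomplete header section")
        head, body = data[:end], data[end + 4:]
        if len(body) > self.max_body_size:
            raise ValueError("Body exceeds max_body_size")
        return head, body
```

Bounding the search window means an oversized or never-terminated header section is rejected without scanning the entire input.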
Alternatives Considered
http.client: Rejected. Sync-only, less control.
h11: Rejected. External dependency.
Regex-based: Rejected. Slower and more complex.
References
RFC 7230: HTTP/1.1 Message Syntax and Routing
RFC 7231: HTTP/1.1 Semantics and Content