
Apache/Nginx Combined Log Format Regex for Python

/^(\S+) \S+ (\S+) \[([^\]]+)\] "([A-Z]+) ([^"]+) HTTP/[\d.]+" ([1-5]\d{2}) (\d+|-)(?:\s"([^"]*)" "([^"]*)")?/

What this pattern does

This page provides a tested regular expression for matching the Apache/Nginx combined log format, ported and verified for Python. It captures the client IP, authenticated user, timestamp, request method and path, status code, and response size. The trailing referer and user-agent fields are optional, so lines without them also match. The snippet below can be dropped into a Python project, whether you are validating input in a Django view, a FastAPI endpoint, or a standalone data processing script.

Python Implementation

Python
# Apache/Nginx Combined Log Format
# ReDoS-safe | RegexVault — Dev & Systems > Log Parsing

import re

apachenginx_combined_log_format_pattern = re.compile(r'^(\S+) \S+ (\S+) \[([^\]]+)\] "([A-Z]+) ([^"]+) HTTP/[\d.]+" ([1-5]\d{2}) (\d+|-)(?:\s"([^"]*)" "([^"]*)")?')

def validate_apachenginx_combined_log_format(value: str) -> bool:
    return bool(apachenginx_combined_log_format_pattern.fullmatch(value))

# Example
# The log line contains double quotes, so the Python literal uses single quotes.
print(validate_apachenginx_combined_log_format('192.168.1.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/5.0"'))  # True

Test Cases

Matches (Valid)
192.168.1.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/5.0"
10.0.0.1 - - [01/Jan/2024:00:00:00 +0000] "POST /api/v1/users HTTP/1.1" 201 450

Rejects (Invalid)
not a log line
192.168.1.1 [date] GET / 200
plain text log
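
The cases above can be turned into a quick self-check. The following is a minimal sketch, not part of the original snippet: the names LOG_PATTERN, VALID, INVALID, and check_cases are illustrative, and the pattern is copied from the Python Implementation section (split across two raw strings for readability only).

```python
import re

# Same pattern as in the Python Implementation section, split over two lines.
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ (\S+) \[([^\]]+)\] "([A-Z]+) ([^"]+) HTTP/[\d.]+" '
    r'([1-5]\d{2}) (\d+|-)(?:\s"([^"]*)" "([^"]*)")?'
)

# Lines the pattern is expected to accept.
VALID = [
    '192.168.1.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" '
    '200 2326 "http://www.example.com/start.html" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Jan/2024:00:00:00 +0000] "POST /api/v1/users HTTP/1.1" 201 450',
]

# Lines the pattern is expected to reject.
INVALID = [
    "not a log line",
    "192.168.1.1 [date] GET / 200",
    "plain text log",
]


def check_cases():
    """Return every line whose match result differs from the expected one."""
    failures = []
    for line in VALID:
        if LOG_PATTERN.fullmatch(line) is None:
            failures.append(line)
    for line in INVALID:
        if LOG_PATTERN.fullmatch(line) is not None:
            failures.append(line)
    return failures


print(check_cases())  # an empty list means every case behaved as listed
```

Because the helper returns the offending lines rather than a single boolean, a failing case is easy to spot when the list is extended with lines from your own logs.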

When to use this pattern

This pattern is drawn from the Dev & Systems > Log Parsing category and carries a ReDoS-safe certification. That is particularly important in Python web servers, where CPU-bound regex operations can stall concurrent request handling. RegexVault audits patterns against known backtracking attack vectors, so you have that context before using this regex in a high-stakes production environment.

Common Pitfalls

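Log lines with escaped quotes (\" inside the request field) will break naive parsers, and this pattern is no exception: its [^"]+ and [^"]* classes stop at the first double quote, so a line containing \" in the request, referer, or user-agent field will generally fail to match. Always handle escaped quotes inside quoted fields.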
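
One way to handle this is to treat a backslash plus the following character as part of the field. The sketch below is an assumption-laden variant, not the certified pattern: QUOTED, ESCAPE_AWARE_PATTERN, and parse_request_line are hypothetical names, the whole request line is captured as one group (so group numbers differ from the main pattern), and it has not been through the RegexVault audit. The two alternatives inside QUOTED cannot match the same character, which keeps backtracking bounded.

```python
import re

# Body of a quoted field: either a character that is neither a quote nor a
# backslash, or a backslash followed by any single character (so \" stays
# inside the field instead of closing it).
QUOTED = r'"((?:[^"\\]|\\.)*)"'

# Hypothetical variant. Groups: 1=client IP, 2=auth user, 3=datetime,
# 4=full request line, 5=status, 6=bytes, 7=referer, 8=user-agent.
ESCAPE_AWARE_PATTERN = re.compile(
    r'(\S+) \S+ (\S+) \[([^\]]+)\] ' + QUOTED + r' ([1-5]\d{2}) (\d+|-)'
    r'(?: ' + QUOTED + r' ' + QUOTED + r')?'
)


def parse_request_line(line):
    """Return (method, target, protocol), or None if the line does not match."""
    match = ESCAPE_AWARE_PATTERN.fullmatch(line)
    if match is None:
        return None
    request = match.group(4)
    # Split on the first and last space so spaces inside the target survive.
    method, _, rest = request.partition(" ")
    target, _, protocol = rest.rpartition(" ")
    return method, target, protocol


# Sample line with escaped quotes in the request and user-agent fields.
SAMPLE = (
    r'10.0.0.1 - - [01/Jan/2024:00:00:00 +0000] '
    r'"GET /search?q=\"a\" HTTP/1.1" 200 12 "-" "curl/8.0 \"test\""'
)

print(parse_request_line(SAMPLE))
```

The target is returned with its backslash escapes intact. Whether to unescape it afterwards depends on how your server writes these fields, so check its logging documentation before relying on this.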

Technical Notes

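Groups: 1=client IP, 2=auth user, 3=datetime, 4=method, 5=path, 6=status, 7=bytes, 8=referer, 9=user-agent. The second field of the line (the ident value) is matched but not captured. The - placeholder indicates a missing value and can appear in the auth user, bytes, and referer fields. When a line has no referer and user-agent at all, groups 8 and 9 are None in Python.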
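
The group numbering can be mapped to named fields. This is a minimal sketch under stated assumptions: FIELD_NAMES and parse_log_line are illustrative names that do not appear in the original snippet, and converting the - placeholder to None is a design choice made here, not something the pattern does itself.

```python
import re
from typing import Optional

# Same pattern as in the Python Implementation section, split over two lines.
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ (\S+) \[([^\]]+)\] "([A-Z]+) ([^"]+) HTTP/[\d.]+" '
    r'([1-5]\d{2}) (\d+|-)(?:\s"([^"]*)" "([^"]*)")?'
)

# Illustrative names for groups 1 through 9, in order.
FIELD_NAMES = (
    "client_ip", "auth_user", "datetime", "method", "path",
    "status", "bytes", "referer", "user_agent",
)


def parse_log_line(line: str) -> Optional[dict]:
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.fullmatch(line)
    if match is None:
        return None
    fields = dict(zip(FIELD_NAMES, match.groups()))
    # Treat the "-" placeholder like an absent optional group: both become None.
    return {name: (None if value == "-" else value) for name, value in fields.items()}


record = parse_log_line(
    '10.0.0.1 - - [01/Jan/2024:00:00:00 +0000] "POST /api/v1/users HTTP/1.1" 201 450'
)
print(record)
```

All values stay as strings. Converting status and bytes to integers, or parsing the timestamp, is left to the caller.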
