As per RFC 7231 §7.1.1.1:
A recipient that parses a timestamp value in an HTTP header field MUST accept all three HTTP-date formats.
The section then describes these formats; the first is the only preferred one, and the latter two are designated "obsolete". Converted for this post into strptime format strings, they are:
%a, %d %b %Y %H:%M:%S %Z
with the timezone always given as "GMT", but to be interpreted as UTC.
%A, %d-%b-%y %H:%M:%S %Z
where the timezone may be equal to any of an array of "standard" abbreviations from RFC 850 §2.1.4.
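%a %b %d %H:%M:%S %Y
where the day is space-padded rather than zero-padded (Python's strptime accepts that with plain %d, whereas %-d is a platform-specific strftime extension that strptime rejects), and where you simply pray/assume that the remote server is operating in UTC.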
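As a quick sanity check, here is a minimal sketch that feeds the three example timestamps given in RFC 7231 through strptime format strings for each form. It assumes an English/C locale (%a, %A and %b are locale-dependent) and uses %d for the asctime day; the name HTTP_DATE_EXAMPLES is made up for this illustration.

```python
from datetime import datetime

# (example string, strptime format) pairs; the strings are the RFC 7231 examples.
HTTP_DATE_EXAMPLES = [
    ("Sun, 06 Nov 1994 08:49:37 GMT", "%a, %d %b %Y %H:%M:%S %Z"),   # IMF-fixdate (preferred)
    ("Sunday, 06-Nov-94 08:49:37 GMT", "%A, %d-%b-%y %H:%M:%S %Z"),  # RFC 850 (obsolete)
    ("Sun Nov  6 08:49:37 1994", "%a %b %d %H:%M:%S %Y"),            # asctime (obsolete)
]

# Each string is parsed with its own format; the results are naive datetimes.
parsed = [datetime.strptime(text, fmt) for text, fmt in HTTP_DATE_EXAMPLES]
for (text, _), value in zip(HTTP_DATE_EXAMPLES, parsed):
    print(f"{text!r} -> {value.isoformat()}")
```

All three strings describe the same instant, so the three parsed values should compare equal.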
Python's strptime function does not support timezones: it eats them with %Z, but does not actually use them. Therefore, we will have to hack this support in ourselves. The pytz module is indispensable for parsing them, so we will appreciate/use it. (We additionally have to crack open re because, depressingly, strptime does not even make available to us that which it matched as %Z.)
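To make the complaint concrete, here is a small sketch: strptime happily consumes the zone name matched by %Z, yet the datetime it hands back is naive, so the zone has to be attached by hand. The variable names are purely illustrative, and the example sticks to UTC so that it needs only the standard library.

```python
from datetime import datetime, timezone

stamp = "Sun, 06 Nov 1994 08:49:37 GMT"
naive = datetime.strptime(stamp, "%a, %d %b %Y %H:%M:%S %Z")

# %Z matched "GMT", but the zone was discarded: the result carries no tzinfo.
print(naive.tzinfo)

# The zone therefore has to be attached manually once we know what it is.
aware = naive.replace(tzinfo=timezone.utc)
print(aware.isoformat())
```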
So, a Python function to parse an HTTP Date header into a datetime object would be something like:
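from datetime import datetime
import re

from pytz import timezone, utc, UnknownTimeZoneError

def parse_http_date(date):
    try:
        imf1 = '%a, %d %b %Y %H:%M:%S GMT'
        return datetime.strptime(date, imf1).replace(tzinfo=utc)
    except ValueError:
        try:
            # strptime's %Z only recognises a few zone names and discards the
            # match, so split the zone off with a regex and parse the rest.
            rfc850 = '%A, %d-%b-%y %H:%M:%S'
            match = re.fullmatch(r'((\w+), (\d+)-(\w+)-(\d+) (\d+):(\d+):(\d+)) (.+)', date)
            stamp, tzname = match.group(1), match.group(9)
            if tzname == 'GMT':
                tzname = 'UTC'
            # pytz zones should be attached with localize(), not replace()
            return timezone(tzname).localize(datetime.strptime(stamp, rfc850))
        except (ValueError, TypeError, AttributeError, UnknownTimeZoneError):
            # AttributeError: the regex did not match (match is None)
            # UnknownTimeZoneError: pytz does not know the abbreviation
            pass
        try:
            # %d (not %-d, which strptime rejects) also accepts a space-padded day
            asctime = '%a %b %d %H:%M:%S %Y'
            return datetime.strptime(date, asctime).replace(tzinfo=utc)
        except ValueError:
            pass
        # Neither of the "obsolete" formats worked, so re-raise the original
        # strptime error from the preferred format
        raise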
If you don't care about parsing obsolete formats, this can be reduced to:
from datetime import datetime
from datetime import timezone as tz

def parse_http_date(date):
    imf1 = '%a, %d %b %Y %H:%M:%S GMT'
    return datetime.strptime(date, imf1).replace(tzinfo=tz.utc)
…which only uses the standard library!
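If you still want to accept the obsolete formats without pulling in pytz, one possible middle ground is to treat every format as UTC, which is how I read RFC 7231's definition of HTTP-date anyway. The following is only a sketch under that assumption: it rejects RFC 850 dates carrying any zone other than a literal "GMT", and the names parse_http_date_utc and _HTTP_DATE_FORMATS are invented for this example.

```python
from datetime import datetime, timezone

# Tried in order: the preferred IMF-fixdate first, then the two obsolete forms.
# Assumption: every format is interpreted as UTC, and only a literal "GMT"
# zone is accepted where a zone appears at all.
_HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",
    "%A, %d-%b-%y %H:%M:%S GMT",
    "%a %b %d %H:%M:%S %Y",
)

def parse_http_date_utc(date):
    """Return an aware UTC datetime, or raise ValueError if no format matches."""
    for fmt in _HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(date, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # try the next format
    raise ValueError(f"not a recognised HTTP-date: {date!r}")

print(parse_http_date_utc("Sunday, 06-Nov-94 08:49:37 GMT").isoformat())
```

The trade-off is that a sender using one of the old RFC 850 zone abbreviations gets an error instead of a converted time, which may or may not be what you want.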