cubicweb/multipart.py
author Laurent Peuch <cortex@worlddomination.be>
Thu, 19 Dec 2019 06:35:54 +0100
changeset 12833 c69421aece24
parent 12579 ce089224954d
permissions -rw-r--r--
[mypy] type AppObject.category
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     1
# -*- coding: utf-8 -*-
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     2
'''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     3
Parser for multipart/form-data
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     4
==============================
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     5
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     6
This module provides a parser for the multipart/form-data format. It can read
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     7
from a file, a socket or a WSGI environment. The parser can be used to replace
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     8
cgi.FieldStorage (without the bugs) and works with Python 2.5+ and 3.x (2to3).
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     9
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    10
Licence (MIT)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    11
-------------
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    12
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    13
    Copyright (c) 2010, Marcel Hellkamp.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    14
    Inspired by the Werkzeug library: http://werkzeug.pocoo.org/
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    15
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    16
    Permission is hereby granted, free of charge, to any person obtaining a copy
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    17
    of this software and associated documentation files (the "Software"), to deal
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    18
    in the Software without restriction, including without limitation the rights
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    19
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    20
    copies of the Software, and to permit persons to whom the Software is
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    21
    furnished to do so, subject to the following conditions:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    22
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    23
    The above copyright notice and this permission notice shall be included in
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    24
    all copies or substantial portions of the Software.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    25
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    26
    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    27
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    28
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    29
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    30
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    31
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    32
    THE SOFTWARE.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    33
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    34
'''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    35
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    36
__author__ = 'Marcel Hellkamp'
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    37
__version__ = '0.1'
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    38
__license__ = 'MIT'
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    39
12567
26744ad37953 Drop python2 support
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11057
diff changeset
    40
from io import BytesIO
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    41
from tempfile import TemporaryFile
12567
26744ad37953 Drop python2 support
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11057
diff changeset
    42
from urllib.parse import parse_qs
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    43
from wsgiref.headers import Headers
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    44
import re, sys
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    45
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 9946
diff changeset
    46
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    47
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    48
################################ Helper & Misc ################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    49
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    50
# Some of these were copied from bottle: http://bottle.paws.de/
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    51
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    52
try:
12579
ce089224954d Fix DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated
Nsukami Patrick <ndkpatt at gmail dot com>
parents: 12578
diff changeset
    53
    from collections.abc import MutableMapping as DictMixin
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    54
except ImportError: # pragma: no cover (fallback for Python 2.5)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    55
    from UserDict import DictMixin
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    56
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    57
class MultiDict(DictMixin):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    58
    """ A dict that remembers old values for each key """
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    59
    def __init__(self, *a, **k):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    60
        self.dict = dict()
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
    61
        for k, v in dict(*a, **k).items():
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    62
            self[k] = v
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    63
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    64
    def __len__(self): return len(self.dict)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    65
    def __iter__(self): return iter(self.dict)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    66
    def __contains__(self, key): return key in self.dict
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    67
    def __delitem__(self, key): del self.dict[key]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    68
    def keys(self): return self.dict.keys()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    69
    def __getitem__(self, key): return self.get(key, KeyError, -1)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    70
    def __setitem__(self, key, value): self.append(key, value)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    71
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    72
    def append(self, key, value): self.dict.setdefault(key, []).append(value)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    73
    def replace(self, key, value): self.dict[key] = [value]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    74
    def getall(self, key): return self.dict.get(key) or []
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    75
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    76
    def get(self, key, default=None, index=-1):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    77
        if key not in self.dict and default != KeyError:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    78
            return [default][index]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    79
        return self.dict[key][index]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    80
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    81
    def iterallitems(self):
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
    82
        for key, values in self.dict.items():
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    83
            for value in values:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    84
                yield key, value
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    85
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    86
def tob(data, enc='utf8'): # Convert strings to bytes (py2 and py3)
12567
26744ad37953 Drop python2 support
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11057
diff changeset
    87
    return data.encode(enc) if isinstance(data, str) else data
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    88
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    89
def copy_file(stream, target, maxread=-1, buffer_size=2*16):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    90
    ''' Read from :stream and write to :target until :maxread or EOF. '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    91
    size, read = 0, stream.read
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    92
    while 1:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    93
        to_read = buffer_size if maxread < 0 else min(buffer_size, maxread-size)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    94
        part = read(to_read)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    95
        if not part: return size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    96
        target.write(part)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    97
        size += len(part)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    98
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    99
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   100
################################ Header Parser ################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   101
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   102
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   103
_special = re.escape('()<>@,;:\\"/[]?={} \t')
12578
d06a2feae373 Fix DeprecationWarning: invalid escape sequence
Nsukami Patrick <ndkpatt at gmail dot com>
parents: 12567
diff changeset
   104
_re_special = re.compile(r'[%s]' % _special)
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   105
_qstr = '"(?:\\\\.|[^"])*"' # Quoted string
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   106
_value = '(?:[^%s]+|%s)' % (_special, _qstr) # Save or quoted string
12578
d06a2feae373 Fix DeprecationWarning: invalid escape sequence
Nsukami Patrick <ndkpatt at gmail dot com>
parents: 12567
diff changeset
   107
_option = r'(?:;|^)\s*([^%s]+)\s*=\s*(%s)' % (_special, _value)
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   108
_re_option = re.compile(_option) # key=value part of an Content-Type like header
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   109
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   110
def header_quote(val):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   111
    if not _re_special.search(val):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   112
        return val
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   113
    return '"' + val.replace('\\','\\\\').replace('"','\\"') + '"'
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   114
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   115
def header_unquote(val, filename=False):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   116
    if val[0] == val[-1] == '"':
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   117
        val = val[1:-1]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   118
        if val[1:3] == ':\\' or val[:2] == '\\\\': 
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   119
            val = val.split('\\')[-1] # fix ie6 bug: full path --> filename
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   120
        return val.replace('\\\\','\\').replace('\\"','"')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   121
    return val
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   122
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   123
def parse_options_header(header, options=None):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   124
    if ';' not in header:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   125
        return header.lower().strip(), {}
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   126
    ctype, tail = header.split(';', 1)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   127
    options = options or {}
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   128
    for match in _re_option.finditer(tail):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   129
        key = match.group(1).lower()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   130
        value = header_unquote(match.group(2), key=='filename')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   131
        options[key] = value
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   132
    return ctype, options
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   133
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   134
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   135
################################## Multipart ##################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   136
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   137
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   138
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   139
class MultipartError(ValueError): pass
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   140
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   141
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   142
class MultipartParser(object):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   143
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   144
    def __init__(self, stream, boundary, content_length=-1,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   145
                 disk_limit=2**30, mem_limit=2**20, memfile_limit=2**18,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   146
                 buffer_size=2**16, charset='latin1'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   147
        ''' Parse a multipart/form-data byte stream. This object is an iterator
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   148
            over the parts of the message.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   149
            
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   150
            :param stream: A file-like stream. Must implement ``.read(size)``.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   151
            :param boundary: The multipart boundary as a byte string.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   152
            :param content_length: The maximum number of bytes to read.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   153
        '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   154
        self.stream, self.boundary = stream, boundary
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   155
        self.content_length = content_length
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   156
        self.disk_limit = disk_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   157
        self.memfile_limit = memfile_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   158
        self.mem_limit = min(mem_limit, self.disk_limit)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   159
        self.buffer_size = min(buffer_size, self.mem_limit)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   160
        self.charset = charset
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   161
        if self.buffer_size - 6 < len(boundary): # "--boundary--\r\n"
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   162
            raise MultipartError('Boundary does not fit into buffer_size.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   163
        self._done = []
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   164
        self._part_iter = None
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   165
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   166
    def __iter__(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   167
        ''' Iterate over the parts of the multipart message. '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   168
        if not self._part_iter:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   169
            self._part_iter = self._iterparse()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   170
        for part in self._done:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   171
            yield part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   172
        for part in self._part_iter:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   173
            self._done.append(part)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   174
            yield part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   175
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   176
    def parts(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   177
        ''' Returns a list with all parts of the multipart message. '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   178
        return list(iter(self))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   179
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   180
    def get(self, name, default=None):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   181
        ''' Return the first part with that name or a default value (None). '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   182
        for part in self:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   183
            if name == part.name:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   184
                return part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   185
        return default
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   186
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   187
    def get_all(self, name):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   188
        ''' Return a list of parts with that name. '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   189
        return [p for p in self if p.name == name]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   190
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   191
    def _lineiter(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   192
        ''' Iterate over a binary file-like object line by line. Each line is
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   193
            returned as a (line, line_ending) tuple. If the line does not fit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   194
            into self.buffer_size, line_ending is empty and the rest of the line
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   195
            is returned with the next iteration.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   196
        '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   197
        read = self.stream.read
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   198
        maxread, maxbuf = self.content_length, self.buffer_size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   199
        _bcrnl = tob('\r\n')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   200
        _bcr = _bcrnl[:1]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   201
        _bnl = _bcrnl[1:]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   202
        _bempty = _bcrnl[:0] # b'rn'[:0] -> b''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   203
        buffer = _bempty # buffer for the last (partial) line
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   204
        while 1:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   205
            data = read(maxbuf if maxread < 0 else min(maxbuf, maxread))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   206
            maxread -= len(data)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   207
            lines = (buffer+data).splitlines(True)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   208
            len_first_line = len(lines[0])
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   209
            # be sure that the first line does not become too big
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   210
            if len_first_line > self.buffer_size:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   211
                # at the same time don't split a '\r\n' accidentally
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   212
                if (len_first_line == self.buffer_size+1 and
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   213
                    lines[0].endswith(_bcrnl)):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   214
                    splitpos = self.buffer_size - 1
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   215
                else:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   216
                    splitpos = self.buffer_size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   217
                lines[:1] = [lines[0][:splitpos],
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   218
                             lines[0][splitpos:]]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   219
            if data:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   220
                buffer = lines[-1]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   221
                lines = lines[:-1]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   222
            for line in lines:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   223
                if line.endswith(_bcrnl): yield line[:-2], _bcrnl
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   224
                elif line.endswith(_bnl): yield line[:-1], _bnl
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   225
                elif line.endswith(_bcr): yield line[:-1], _bcr
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   226
                else:                     yield line, _bempty
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   227
            if not data:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   228
                break
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   229
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   230
    def _iterparse(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   231
        lines, line = self._lineiter(), ''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   232
        separator = tob('--') + tob(self.boundary)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   233
        terminator = tob('--') + tob(self.boundary) + tob('--')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   234
        # Consume first boundary. Ignore leading blank lines
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   235
        for line, nl in lines:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   236
            if line: break
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   237
        if line != separator:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   238
            raise MultipartError("Stream does not start with boundary")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   239
        # For each part in stream...
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   240
        mem_used, disk_used = 0, 0 # Track used resources to prevent DoS
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   241
        is_tail = False # True if the last line was incomplete (cutted)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   242
        opts = {'buffer_size': self.buffer_size,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   243
                'memfile_limit': self.memfile_limit,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   244
                'charset': self.charset}
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   245
        part = MultipartPart(**opts)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   246
        for line, nl in lines:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   247
            if line == terminator and not is_tail:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   248
                part.file.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   249
                yield part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   250
                break
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   251
            elif line == separator and not is_tail:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   252
                if part.is_buffered(): mem_used  += part.size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   253
                else:                  disk_used += part.size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   254
                part.file.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   255
                yield part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   256
                part = MultipartPart(**opts)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   257
            else:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   258
                is_tail = not nl # The next line continues this one
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   259
                part.feed(line, nl)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   260
                if part.is_buffered():
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   261
                    if part.size + mem_used > self.mem_limit:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   262
                        raise MultipartError("Memory limit reached.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   263
                elif part.size + disk_used > self.disk_limit:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   264
                    raise MultipartError("Disk limit reached.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   265
        if line != terminator:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   266
            raise MultipartError("Unexpected end of multipart stream.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   267
            
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   268
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   269
class MultipartPart(object):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   270
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   271
    def __init__(self, buffer_size=2**16, memfile_limit=2**18, charset='latin1'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   272
        self.headerlist = []
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   273
        self.headers = None
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   274
        self.file = False
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   275
        self.size = 0
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   276
        self._buf = tob('')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   277
        self.disposition, self.name, self.filename = None, None, None
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   278
        self.content_type, self.charset = None, charset
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   279
        self.memfile_limit = memfile_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   280
        self.buffer_size = buffer_size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   281
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   282
    def feed(self, line, nl=''):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   283
        if self.file:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   284
            return self.write_body(line, nl)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   285
        return self.write_header(line, nl)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   286
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   287
    def write_header(self, line, nl):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   288
        line = line.decode(self.charset or 'latin1')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   289
        if not nl: raise MultipartError('Unexpected end of line in header.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   290
        if not line.strip(): # blank line -> end of header segment
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   291
            self.finish_header()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   292
        elif line[0] in ' \t' and self.headerlist:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   293
            name, value = self.headerlist.pop()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   294
            self.headerlist.append((name, value+line.strip()))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   295
        else:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   296
            if ':' not in line:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   297
                raise MultipartError("Syntax error in header: No colon.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   298
            name, value = line.split(':', 1)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   299
            self.headerlist.append((name.strip(), value.strip()))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   300
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   301
    def write_body(self, line, nl):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   302
        if not line and not nl: return # This does not even flush the buffer
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   303
        self.size += len(line) + len(self._buf)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   304
        self.file.write(self._buf + line)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   305
        self._buf = nl
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   306
        if self.content_length > 0 and self.size > self.content_length:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   307
            raise MultipartError('Size of body exceeds Content-Length header.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   308
        if self.size > self.memfile_limit and isinstance(self.file, BytesIO):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   309
            # TODO: What about non-file uploads that exceed the memfile_limit?
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   310
            self.file, old = TemporaryFile(mode='w+b'), self.file
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   311
            old.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   312
            copy_file(old, self.file, self.size, self.buffer_size)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   313
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   314
    def finish_header(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   315
        self.file = BytesIO()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   316
        self.headers = Headers(self.headerlist)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   317
        cdis = self.headers.get('Content-Disposition','')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   318
        ctype = self.headers.get('Content-Type','')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   319
        clen = self.headers.get('Content-Length','-1')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   320
        if not cdis:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   321
            raise MultipartError('Content-Disposition header is missing.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   322
        self.disposition, self.options = parse_options_header(cdis)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   323
        self.name = self.options.get('name')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   324
        self.filename = self.options.get('filename')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   325
        self.content_type, options = parse_options_header(ctype)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   326
        self.charset = options.get('charset') or self.charset
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   327
        self.content_length = int(self.headers.get('Content-Length','-1'))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   328
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   329
    def is_buffered(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   330
        ''' Return true if the data is fully buffered in memory.'''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   331
        return isinstance(self.file, BytesIO)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   332
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   333
    @property
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   334
    def value(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   335
        ''' Data decoded with the specified charset '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   336
        pos = self.file.tell()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   337
        self.file.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   338
        val = self.file.read()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   339
        self.file.seek(pos)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   340
        return val.decode(self.charset)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   341
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   342
    def save_as(self, path):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   343
        fp = open(path, 'wb')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   344
        pos = self.file.tell()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   345
        try:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   346
            self.file.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   347
            size = copy_file(self.file, fp)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   348
        finally:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   349
            self.file.seek(pos)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   350
        return size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   351
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   352
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   353
#################################### WSGI ####################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   354
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   355
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   356
def parse_form_data(environ, charset='utf8', strict=False, **kw):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   357
    ''' Parse form data from an environ dict and return a (forms, files) tuple.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   358
        Both tuple values are dictionaries with the form-field name as a key
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   359
        (unicode) and lists as values (multiple values per key are possible).
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   360
        The forms-dictionary contains form-field values as unicode strings.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   361
        The files-dictionary contains :class:`MultipartPart` instances, either
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   362
        because the form-field was a file-upload or the value is to big to fit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   363
        into memory limits.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   364
        
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   365
        :param environ: An WSGI environment dict.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   366
        :param charset: The charset to use if unsure. (default: utf8)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   367
        :param strict: If True, raise :exc:`MultipartError` on any parsing
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   368
                       errors. These are silently ignored by default.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   369
    '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   370
        
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   371
    forms, files = MultiDict(), MultiDict()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   372
    try:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   373
        if environ.get('REQUEST_METHOD','GET').upper() not in ('POST', 'PUT'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   374
            raise MultipartError("Request method other than POST or PUT.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   375
        content_length = int(environ.get('CONTENT_LENGTH', '-1'))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   376
        content_type = environ.get('CONTENT_TYPE', '')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   377
        if not content_type:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   378
            raise MultipartError("Missing Content-Type header.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   379
        content_type, options = parse_options_header(content_type)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   380
        stream = environ.get('wsgi.input') or BytesIO()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   381
        kw['charset'] = charset = options.get('charset', charset)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   382
        if content_type == 'multipart/form-data':
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   383
            boundary = options.get('boundary','')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   384
            if not boundary:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   385
                raise MultipartError("No boundary for multipart/form-data.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   386
            for part in MultipartParser(stream, boundary, content_length, **kw):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   387
                if part.filename or not part.is_buffered():
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   388
                    files[part.name] = part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   389
                else: # TODO: Big form-fields are in the files dict. really?
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   390
                    forms[part.name] = part.value
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   391
        elif content_type in ('application/x-www-form-urlencoded',
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   392
                              'application/x-url-encoded'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   393
            mem_limit = kw.get('mem_limit', 2**20)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   394
            if content_length > mem_limit:
10746
407385314c0d [multipart] decode form data before calling parse_qs in python 3
Julien Cristau <julien.cristau@logilab.fr>
parents: 10745
diff changeset
   395
                raise MultipartError("Request too big. Increase MAXMEM.")
9946
ec88c1a1904a [wsgi] Fix unicode decoding in POST
Christophe de Vienne <christophe@unlish.com>
parents: 9735
diff changeset
   396
            data = stream.read(mem_limit)
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   397
            if stream.read(1): # These is more that does not fit mem_limit
10746
407385314c0d [multipart] decode form data before calling parse_qs in python 3
Julien Cristau <julien.cristau@logilab.fr>
parents: 10745
diff changeset
   398
                raise MultipartError("Request too big. Increase MAXMEM.")
12567
26744ad37953 Drop python2 support
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11057
diff changeset
   399
            data = data.decode('ascii')
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   400
            data = parse_qs(data, keep_blank_values=True)
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
   401
            for key, values in data.items():
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   402
                for value in values:
12567
26744ad37953 Drop python2 support
Denis Laxalde <denis.laxalde@logilab.fr>
parents: 11057
diff changeset
   403
                    forms[key] = value
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   404
        else:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   405
            raise MultipartError("Unsupported content type.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   406
    except MultipartError:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   407
        if strict: raise
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   408
    return forms, files