cubicweb/multipart.py
author Denis Laxalde <denis.laxalde@logilab.fr>
Fri, 05 Apr 2019 17:58:19 +0200
changeset 12567 26744ad37953
parent 11057 0b59724cb3f2
child 12578 d06a2feae373
permissions -rw-r--r--
Drop python2 support This mostly consists in removing the dependency on "six" and updating the code to use only Python3 idioms. Notice that we previously used TemporaryDirectory from cubicweb.devtools.testlib for compatibility with Python2. We now directly import it from tempfile.
# -*- coding: utf-8 -*-
'''
Parser for multipart/form-data
==============================

This module provides a parser for the multipart/form-data format. It can read
from a file, a socket or a WSGI environment. The parser can be used to replace
cgi.FieldStorage (without the bugs) and works with Python 3.

Licence (MIT)
-------------

    Copyright (c) 2010, Marcel Hellkamp.
    Inspired by the Werkzeug library: http://werkzeug.pocoo.org/

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to deal
    in the Software without restriction, including without limitation the rights
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    copies of the Software, and to permit persons to whom the Software is
    furnished to do so, subject to the following conditions:

    The above copyright notice and this permission notice shall be included in
    all copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
    THE SOFTWARE.

'''

__author__ = 'Marcel Hellkamp'
__version__ = '0.1'
__license__ = 'MIT'

from io import BytesIO
from tempfile import TemporaryFile
from urllib.parse import parse_qs
from wsgiref.headers import Headers
import re, sys


##############################################################################
################################ Helper & Misc ################################
##############################################################################
# Some of these were copied from bottle: http://bottle.paws.de/

# MutableMapping lives in collections.abc; the old collections alias was
# removed in Python 3.10 and the UserDict fallback only existed on Python 2.
from collections.abc import MutableMapping as DictMixin

class MultiDict(DictMixin):
    """ A dict that remembers old values for each key """
    def __init__(self, *a, **k):
        self.dict = dict()
        for k, v in dict(*a, **k).items():
            self[k] = v

    def __len__(self): return len(self.dict)
    def __iter__(self): return iter(self.dict)
    def __contains__(self, key): return key in self.dict
    def __delitem__(self, key): del self.dict[key]
    def keys(self): return self.dict.keys()
    def __getitem__(self, key): return self.get(key, KeyError, -1)
    def __setitem__(self, key, value): self.append(key, value)

    def append(self, key, value): self.dict.setdefault(key, []).append(value)
    def replace(self, key, value): self.dict[key] = [value]
    def getall(self, key): return self.dict.get(key) or []

    def get(self, key, default=None, index=-1):
        if key not in self.dict and default != KeyError:
            return [default][index]
        return self.dict[key][index]

    def iterallitems(self):
        for key, values in self.dict.items():
            for value in values:
                yield key, value
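
The behaviour of `MultiDict` is easiest to see with a repeated form field. The following is only a sketch: the class is repeated in condensed form so the snippet runs on its own, and the field name `tag` and its values are made up for illustration.

```python
from collections.abc import MutableMapping


class MultiDict(MutableMapping):
    """Condensed copy of the class above, repeated so the sketch is standalone."""

    def __init__(self, *a, **k):
        self.dict = {}
        for key, value in dict(*a, **k).items():
            self[key] = value

    def __len__(self): return len(self.dict)
    def __iter__(self): return iter(self.dict)
    def __delitem__(self, key): del self.dict[key]
    def __getitem__(self, key): return self.get(key, KeyError, -1)
    def __setitem__(self, key, value): self.append(key, value)
    def append(self, key, value): self.dict.setdefault(key, []).append(value)
    def replace(self, key, value): self.dict[key] = [value]
    def getall(self, key): return self.dict.get(key) or []

    def get(self, key, default=None, index=-1):
        if key not in self.dict and default != KeyError:
            return [default][index]
        return self.dict[key][index]


fields = MultiDict()
fields['tag'] = 'red'                 # item assignment appends ...
fields['tag'] = 'blue'                # ... so the first value is kept
latest = fields['tag']                # plain lookup returns the newest value
oldest = fields.get('tag', index=0)   # index selects an older value
everything = fields.getall('tag')     # full history for the key
fields.replace('tag', 'green')        # discard history, keep a single value
missing = fields.get('absent', 'n/a') # default for unknown keys
```

Note that item assignment never overwrites; `replace()` is the only call here that drops earlier values.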

def tob(data, enc='utf8'): # Encode str to bytes; bytes pass through unchanged
    return data.encode(enc) if isinstance(data, str) else data

def copy_file(stream, target, maxread=-1, buffer_size=2**16):
    ''' Read from :stream and write to :target until :maxread or EOF. '''
    size, read = 0, stream.read
    while 1:
        to_read = buffer_size if maxread < 0 else min(buffer_size, maxread-size)
        part = read(to_read)
        if not part: return size
        target.write(part)
        size += len(part)
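
A short sketch of how the `maxread` cap behaves, using in-memory streams. The function is restated here so the snippet is standalone; the 64 KiB chunk size and the payload sizes are assumptions chosen for the example.

```python
from io import BytesIO


def copy_file(stream, target, maxread=-1, buffer_size=2**16):
    """Copy in chunks until EOF, or until maxread bytes when maxread >= 0."""
    size, read = 0, stream.read
    while True:
        # Never request more than what is left of the maxread budget.
        to_read = buffer_size if maxread < 0 else min(buffer_size, maxread - size)
        part = read(to_read)
        if not part:          # EOF, or the budget is used up (read(0) is empty)
            return size
        target.write(part)
        size += len(part)


source = BytesIO(b'x' * 100000)

capped = BytesIO()
copied = copy_file(source, capped, maxread=70000)   # stops at the cap

rest = BytesIO()
remaining = copy_file(source, rest)                 # drains what is left
```

Because the source stream keeps its position, the second call picks up exactly where the capped copy stopped.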

##############################################################################
################################ Header Parser ################################
##############################################################################

_special = re.escape('()<>@,;:\\"/[]?={} \t')
_re_special = re.compile('[%s]' % _special)
_qstr = '"(?:\\\\.|[^"])*"' # Quoted string
_value = '(?:[^%s]+|%s)' % (_special, _qstr) # Safe or quoted string
_option = r'(?:;|^)\s*([^%s]+)\s*=\s*(%s)' % (_special, _value)
_re_option = re.compile(_option) # key=value part of a Content-Type-like header

def header_quote(val):
    if not _re_special.search(val):
        return val
    return '"' + val.replace('\\','\\\\').replace('"','\\"') + '"'

def header_unquote(val, filename=False):
    if val[0] == val[-1] == '"':
        val = val[1:-1]
        if val[1:3] == ':\\' or val[:2] == '\\\\':
            val = val.split('\\')[-1] # fix ie6 bug: full path --> filename
        return val.replace('\\\\','\\').replace('\\"','"')
    return val
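
The two helpers above are intended to be inverses for ordinary values, with one extra rule for Windows-style paths. A minimal sketch follows, with both functions restated so it runs on its own; the sample values are invented.

```python
import re

# Same special-character set as in the listing above.
_special = re.escape('()<>@,;:\\"/[]?={} \t')
_re_special = re.compile('[%s]' % _special)


def header_quote(val):
    if not _re_special.search(val):
        return val                      # plain tokens need no quoting
    return '"' + val.replace('\\', '\\\\').replace('"', '\\"') + '"'


def header_unquote(val, filename=False):
    if val[0] == val[-1] == '"':
        val = val[1:-1]
        if val[1:3] == ':\\' or val[:2] == '\\\\':
            val = val.split('\\')[-1]   # full path --> bare file name
        return val.replace('\\\\', '\\').replace('\\"', '"')
    return val


plain = header_quote('report')               # no special characters
quoted = header_quote('annual report.txt')   # the space forces quoting
tricky = header_quote('say "hi"')            # inner quotes get escaped
roundtrip = header_unquote(tricky)
ie_path = header_unquote('"C:\\Users\\me\\report.txt"')
```

In this excerpt the path stripping is applied to any quoted value that looks like a drive or UNC path; the `filename` argument is accepted but not consulted.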

def parse_options_header(header, options=None):
    if ';' not in header:
        return header.lower().strip(), {}
    ctype, tail = header.split(';', 1)
    options = options or {}
    for match in _re_option.finditer(tail):
        key = match.group(1).lower()
        value = header_unquote(match.group(2), key=='filename')
        options[key] = value
    return ctype, options
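
To make the return shape concrete, here is a sketch that runs `parse_options_header` on three header values. The regular expressions and helpers are restated so the snippet is standalone, and the header values (boundary, field name, file path) are made up.

```python
import re

_special = re.escape('()<>@,;:\\"/[]?={} \t')
_qstr = '"(?:\\\\.|[^"])*"'
_value = '(?:[^%s]+|%s)' % (_special, _qstr)
_re_option = re.compile(r'(?:;|^)\s*([^%s]+)\s*=\s*(%s)' % (_special, _value))


def header_unquote(val, filename=False):
    if val[0] == val[-1] == '"':
        val = val[1:-1]
        if val[1:3] == ':\\' or val[:2] == '\\\\':
            val = val.split('\\')[-1]
        return val.replace('\\\\', '\\').replace('\\"', '"')
    return val


def parse_options_header(header, options=None):
    if ';' not in header:
        return header.lower().strip(), {}
    ctype, tail = header.split(';', 1)
    options = options or {}
    for match in _re_option.finditer(tail):
        key = match.group(1).lower()
        options[key] = header_unquote(match.group(2), key == 'filename')
    return ctype, options


# Content-Type of a form upload: main value plus the boundary option.
ctype, opts = parse_options_header('multipart/form-data; boundary=abc123')

# Content-Disposition of a single part, with a full client-side path.
disp, fields = parse_options_header(
    r'form-data; name="upload"; filename="C:\Users\me\report.txt"')

# Without any option, the value is lowercased and stripped.
bare = parse_options_header('Text/Plain ')
```

One detail worth knowing: the main value is normalized (lowercased, stripped) only when the header has no `;`. With options present it is returned exactly as it was split off.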

##############################################################################
################################## Multipart ##################################
##############################################################################


class MultipartError(ValueError): pass


class MultipartParser(object):

    def __init__(self, stream, boundary, content_length=-1,
                 disk_limit=2**30, mem_limit=2**20, memfile_limit=2**18,
                 buffer_size=2**16, charset='latin1'):
        ''' Parse a multipart/form-data byte stream. This object is an iterator
            over the parts of the message.

            :param stream: A file-like stream. Must implement ``.read(size)``.
            :param boundary: The multipart boundary as a byte string.
            :param content_length: The maximum number of bytes to read.
        '''
        self.stream, self.boundary = stream, boundary
        self.content_length = content_length
        self.disk_limit = disk_limit
        self.memfile_limit = memfile_limit
        self.mem_limit = min(mem_limit, self.disk_limit)
        self.buffer_size = min(buffer_size, self.mem_limit)
        self.charset = charset
        if self.buffer_size - 6 < len(boundary): # "--boundary--\r\n"
            raise MultipartError('Boundary does not fit into buffer_size.')
        self._done = []
        self._part_iter = None
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   165
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    def __iter__(self):
        ''' Iterate over the parts of the multipart message. '''
        if not self._part_iter:
            self._part_iter = self._iterparse()
        for part in self._done:
            yield part
        for part in self._part_iter:
            self._done.append(part)
            yield part

    def parts(self):
        ''' Returns a list with all parts of the multipart message. '''
        return list(iter(self))

    def get(self, name, default=None):
        ''' Return the first part with that name or a default value (None). '''
        for part in self:
            if name == part.name:
                return part
        return default

    def get_all(self, name):
        ''' Return a list of parts with that name. '''
        return [p for p in self if p.name == name]

    def _lineiter(self):
        ''' Iterate over a binary file-like object line by line. Each line is
            returned as a (line, line_ending) tuple. If the line does not fit
            into self.buffer_size, line_ending is empty and the rest of the line
            is returned with the next iteration.
        '''
        read = self.stream.read
        maxread, maxbuf = self.content_length, self.buffer_size
        _bcrnl = tob('\r\n')
        _bcr = _bcrnl[:1]
        _bnl = _bcrnl[1:]
        _bempty = _bcrnl[:0] # b'\r\n'[:0] -> b''
        buffer = _bempty # buffer for the last (partial) line
        while True:
            data = read(maxbuf if maxread < 0 else min(maxbuf, maxread))
            maxread -= len(data)
            lines = (buffer+data).splitlines(True)
            # an empty stream produces no lines at all; avoid an IndexError
            len_first_line = len(lines[0]) if lines else 0
            # be sure that the first line does not become too big
            if len_first_line > self.buffer_size:
                # at the same time don't split a '\r\n' accidentally
                if (len_first_line == self.buffer_size+1 and
                    lines[0].endswith(_bcrnl)):
                    splitpos = self.buffer_size - 1
                else:
                    splitpos = self.buffer_size
                lines[:1] = [lines[0][:splitpos],
                             lines[0][splitpos:]]
            if data:
                buffer = lines[-1]
                lines = lines[:-1]
            for line in lines:
                if line.endswith(_bcrnl): yield line[:-2], _bcrnl
                elif line.endswith(_bnl): yield line[:-1], _bnl
                elif line.endswith(_bcr): yield line[:-1], _bcr
                else:                     yield line, _bempty
            if not data:
                break
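
# Illustrative sketch only, not part of this module: a simplified, standalone
# version of the line iteration above. It keeps the "hold back the last
# (possibly partial) line" idea and the (line, line_ending) tuples, but it
# deliberately omits the splitting of over-long lines and the content_length
# cap. The name iter_lines is hypothetical.

```python
from io import BytesIO


def iter_lines(stream, buffer_size=2**16):
    """Yield (line, line_ending) tuples from a binary stream (simplified)."""
    buffer = b''  # last, possibly incomplete, line of the previous chunk
    while True:
        data = stream.read(buffer_size)
        lines = (buffer + data).splitlines(True)
        if data:
            # More data may follow, so the last line may be incomplete:
            # keep it back and prepend it to the next chunk.
            buffer, lines = lines[-1], lines[:-1]
        for line in lines:
            if line.endswith(b'\r\n'):
                yield line[:-2], b'\r\n'
            elif line.endswith(b'\n'):
                yield line[:-1], b'\n'
            elif line.endswith(b'\r'):
                yield line[:-1], b'\r'
            else:
                yield line, b''
        if not data:
            break


# Usage: a tiny buffer forces several reads, yet lines come out whole.
result = list(iter_lines(BytesIO(b'a\r\nb\nc'), buffer_size=4))
```

# Holding back the last line also keeps a '\r\n' pair together when it happens
# to straddle two reads, because the lone '\r' is re-joined with the next chunk.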

    def _iterparse(self):
        lines, line = self._lineiter(), ''
        separator = tob('--') + tob(self.boundary)
        terminator = tob('--') + tob(self.boundary) + tob('--')
        # Consume first boundary. Ignore leading blank lines
        for line, nl in lines:
            if line: break
        if line != separator:
            raise MultipartError("Stream does not start with boundary")
        # For each part in stream...
        mem_used, disk_used = 0, 0 # Track used resources to prevent DoS
        is_tail = False # True if the last line was incomplete (cut off)
        opts = {'buffer_size': self.buffer_size,
                'memfile_limit': self.memfile_limit,
                'charset': self.charset}
        part = MultipartPart(**opts)
        for line, nl in lines:
            if line == terminator and not is_tail:
                part.file.seek(0)
                yield part
                break
            elif line == separator and not is_tail:
                if part.is_buffered(): mem_used  += part.size
                else:                  disk_used += part.size
                part.file.seek(0)
                yield part
                part = MultipartPart(**opts)
            else:
                is_tail = not nl # The next line continues this one
                part.feed(line, nl)
                if part.is_buffered():
                    if part.size + mem_used > self.mem_limit:
                        raise MultipartError("Memory limit reached.")
                elif part.size + disk_used > self.disk_limit:
                    raise MultipartError("Disk limit reached.")
        if line != terminator:
            raise MultipartError("Unexpected end of multipart stream.")
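
# Illustrative sketch only, not part of this module: the boundary handling of
# _iterparse reduced to a standalone function over an in-memory body. It
# assumes the whole body fits in memory and skips the resource limits and the
# is_tail check for over-long lines. The names split_parts and _finish are
# hypothetical.

```python
def _finish(chunks):
    """Join the lines of a part; the final line ending belongs to the boundary."""
    data = b''.join(chunks)
    for ending in (b'\r\n', b'\n', b'\r'):
        if data.endswith(ending):
            return data[:-len(ending)]
    return data


def split_parts(body, boundary):
    """Split a multipart body (bytes) into raw parts (headers plus payload)."""
    separator = b'--' + boundary
    terminator = separator + b'--'
    parts, current, started = [], [], False
    for raw in body.splitlines(True):
        line = raw.rstrip(b'\r\n')
        if not started:
            if not line:
                continue  # ignore leading blank lines
            if line != separator:
                raise ValueError('Stream does not start with boundary')
            started = True
        elif line == terminator:
            parts.append(_finish(current))
            return parts
        elif line == separator:
            parts.append(_finish(current))
            current = []
        else:
            current.append(raw)
    raise ValueError('Unexpected end of multipart stream.')


body = (b'--xyz\r\n'
        b'Content-Disposition: form-data; name="a"\r\n\r\n1\r\n'
        b'--xyz\r\n'
        b'Content-Disposition: form-data; name="b"\r\n\r\n2\r\n'
        b'--xyz--\r\n')
raw_parts = split_parts(body, b'xyz')
```

# As in the parser above, a missing terminator is reported as an error instead
# of silently returning the parts collected so far.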


class MultipartPart(object):

    def __init__(self, buffer_size=2**16, memfile_limit=2**18, charset='latin1'):
        self.headerlist = []
        self.headers = None
        self.file = False
        self.size = 0
        self._buf = tob('')
        self.disposition, self.name, self.filename = None, None, None
        self.content_type, self.charset = None, charset
        self.memfile_limit = memfile_limit
        self.buffer_size = buffer_size

    def feed(self, line, nl=''):
        if self.file:
            return self.write_body(line, nl)
        return self.write_header(line, nl)

    def write_header(self, line, nl):
        line = line.decode(self.charset or 'latin1')
        if not nl: raise MultipartError('Unexpected end of line in header.')
        if not line.strip(): # blank line -> end of header segment
            self.finish_header()
        elif line[0] in ' \t' and self.headerlist:
            name, value = self.headerlist.pop()
            self.headerlist.append((name, value+line.strip()))
        else:
            if ':' not in line:
                raise MultipartError("Syntax error in header: No colon.")
            name, value = line.split(':', 1)
            self.headerlist.append((name.strip(), value.strip()))

    def write_body(self, line, nl):
        if not line and not nl: return # This does not even flush the buffer
        self.size += len(line) + len(self._buf)
        self.file.write(self._buf + line)
        self._buf = nl
        if self.content_length > 0 and self.size > self.content_length:
            raise MultipartError('Size of body exceeds Content-Length header.')
        if self.size > self.memfile_limit and isinstance(self.file, BytesIO):
            # TODO: What about non-file uploads that exceed the memfile_limit?
            self.file, old = TemporaryFile(mode='w+b'), self.file
            old.seek(0)
            copy_file(old, self.file, self.size, self.buffer_size)

    def finish_header(self):
        self.file = BytesIO()
        self.headers = Headers(self.headerlist)
        cdis = self.headers.get('Content-Disposition','')
        ctype = self.headers.get('Content-Type','')
        clen = self.headers.get('Content-Length','-1')
        if not cdis:
            raise MultipartError('Content-Disposition header is missing.')
        self.disposition, self.options = parse_options_header(cdis)
        self.name = self.options.get('name')
        self.filename = self.options.get('filename')
        self.content_type, options = parse_options_header(ctype)
        self.charset = options.get('charset') or self.charset
        # reuse the value fetched above instead of looking the header up twice
        self.content_length = int(clen)

    def is_buffered(self):
        ''' Return true if the data is fully buffered in memory.'''
        return isinstance(self.file, BytesIO)

    @property
    def value(self):
        ''' Data decoded with the specified charset '''
        pos = self.file.tell()
        self.file.seek(0)
        val = self.file.read()
        self.file.seek(pos)
        return val.decode(self.charset)

   342
    def save_as(self, path):
        pos = self.file.tell()
        try:
            self.file.seek(0)
            # Close the target file even if the copy fails.
            with open(path, 'wb') as fp:
                size = copy_file(self.file, fp)
        finally:
            self.file.seek(pos)
        return size
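
# A minimal sketch of the save-and-restore pattern used by save_as() above,
# assuming a seekable binary stream. save_stream_as() and demo() are
# hypothetical names, and shutil.copyfileobj stands in for this module's
# copy_file helper, so the byte count is read from the target file instead.

```python
import io
import os
import shutil
import tempfile


def save_stream_as(stream, path):
    """Copy a seekable binary stream to ``path`` and restore its position.

    Hypothetical stand-in for ``save_as``; not part of this module.
    """
    pos = stream.tell()
    try:
        stream.seek(0)
        with open(path, 'wb') as fp:
            shutil.copyfileobj(stream, fp)
            size = fp.tell()  # number of bytes written to the target
    finally:
        # Put the read position back, whether or not the copy succeeded.
        stream.seek(pos)
    return size


def demo():
    stream = io.BytesIO(b'hello world')
    stream.seek(5)  # pretend the caller has already read five bytes
    with tempfile.TemporaryDirectory() as tmpdir:
        target = os.path.join(tmpdir, 'upload.bin')
        size = save_stream_as(stream, target)
        with open(target, 'rb') as fp:
            saved = fp.read()
    return size, saved, stream.tell()
```

# Restoring the position in a finally clause means a caller that is halfway
# through reading the part can keep reading after the copy.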

##############################################################################
#################################### WSGI ####################################
##############################################################################

def parse_form_data(environ, charset='utf8', strict=False, **kw):
    ''' Parse form data from an environ dict and return a (forms, files) tuple.
        Both tuple values are dictionaries with the form-field name as a key
        (unicode) and lists as values (multiple values per key are possible).
        The forms-dictionary contains form-field values as unicode strings.
        The files-dictionary contains :class:`MultipartPart` instances, either
        because the form-field was a file-upload or the value is too big to
        fit within the memory limit.

        :param environ: A WSGI environment dict.
        :param charset: The charset to use if unsure. (default: utf8)
        :param strict: If True, raise :exc:`MultipartError` on any parsing
                       errors. These are silently ignored by default.
        :param kw: Extra keyword arguments passed on to
                   :class:`MultipartParser` (for example ``mem_limit``).
    '''

    forms, files = MultiDict(), MultiDict()
    try:
        if environ.get('REQUEST_METHOD', 'GET').upper() not in ('POST', 'PUT'):
            raise MultipartError("Request method other than POST or PUT.")
        content_length = int(environ.get('CONTENT_LENGTH', '-1'))
        content_type = environ.get('CONTENT_TYPE', '')
        if not content_type:
            raise MultipartError("Missing Content-Type header.")
        content_type, options = parse_options_header(content_type)
        stream = environ.get('wsgi.input') or BytesIO()
        kw['charset'] = charset = options.get('charset', charset)
        if content_type == 'multipart/form-data':
            boundary = options.get('boundary', '')
            if not boundary:
                raise MultipartError("No boundary for multipart/form-data.")
            for part in MultipartParser(stream, boundary, content_length, **kw):
                if part.filename or not part.is_buffered():
                    files[part.name] = part
                else: # TODO: Big form-fields are in the files dict. really?
                    forms[part.name] = part.value
        elif content_type in ('application/x-www-form-urlencoded',
                              'application/x-url-encoded'):
            mem_limit = kw.get('mem_limit', 2**20)
            if content_length > mem_limit:
                # Next line last changed in changeset 10746 (407385314c0d,
                # parent 10745) by Julien Cristau <julien.cristau@logilab.fr>:
                # [multipart] decode form data before calling parse_qs in python 3
                raise MultipartError("Request too big. Increase MAXMEM.")
            # Next line last changed in changeset 9946 (ec88c1a1904a,
            # parent 9735) by Christophe de Vienne <christophe@unlish.com>:
            # [wsgi] Fix unicode decoding in POST
            data = stream.read(mem_limit)
            if stream.read(1): # There is more data than fits in mem_limit
                raise MultipartError("Request too big. Increase MAXMEM.")
            data = data.decode('ascii')
            data = parse_qs(data, keep_blank_values=True)
            # Next line last changed in changeset 10662 (10942ed172de,
            # parent 10603) by Rémi Cardona <remi.cardona@logilab.fr>:
            # [py3k] dict.iteritems → dict.items
            for key, values in data.items():
                for value in values:
                    forms[key] = value
        else:
            raise MultipartError("Unsupported content type.")
    except MultipartError:
        if strict:
            raise
    return forms, files
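
# A rough, standard-library-only sketch of what the URL-encoded branch of
# parse_form_data() above does with a WSGI environ: check the declared
# length, read at most mem_limit bytes, refuse bodies that turn out to be
# longer, then decode and split the fields. parse_urlencoded() and demo()
# are hypothetical names, ValueError stands in for MultipartError, and a
# plain dict of lists stands in for MultiDict.

```python
import io
from urllib.parse import parse_qs


def parse_urlencoded(environ, mem_limit=2**20):
    """Return {name: [values]} for a URL-encoded WSGI request body."""
    content_length = int(environ.get('CONTENT_LENGTH', '-1'))
    if content_length > mem_limit:
        raise ValueError('request body larger than mem_limit')
    stream = environ.get('wsgi.input') or io.BytesIO()
    data = stream.read(mem_limit)
    if stream.read(1):
        # The declared length was missing or wrong; more data is pending.
        raise ValueError('request body larger than mem_limit')
    # keep_blank_values=True keeps fields that were submitted empty.
    return parse_qs(data.decode('ascii'), keep_blank_values=True)


def demo():
    body = b'name=Ada&tag=a&tag=b&empty='
    environ = {
        'REQUEST_METHOD': 'POST',
        'CONTENT_TYPE': 'application/x-www-form-urlencoded',
        'CONTENT_LENGTH': str(len(body)),
        'wsgi.input': io.BytesIO(body),
    }
    return parse_urlencoded(environ)
```

# The second, one-byte read is what catches a body that is larger than the
# limit when CONTENT_LENGTH is absent or understated.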