multipart.py
author Rémi Cardona <remi.cardona@logilab.fr>
Fri, 18 Sep 2015 12:34:51 +0200
changeset 10709 3ada28a113f8
parent 10662 10942ed172de
child 10741 6666e52d8dd9
permissions -rw-r--r--
[py3k] dict methods no longer return indexable objects
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     1
# -*- coding: utf-8 -*-
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     2
'''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     3
Parser for multipart/form-data
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     4
==============================
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     5
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     6
This module provides a parser for the multipart/form-data format. It can read
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     7
from a file, a socket or a WSGI environment. The parser can be used to replace
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     8
cgi.FieldStorage (without the bugs) and works with Python 2.5+ and 3.x (2to3).
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
     9
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    10
Licence (MIT)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    11
-------------
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    12
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    13
    Copyright (c) 2010, Marcel Hellkamp.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    14
    Inspired by the Werkzeug library: http://werkzeug.pocoo.org/
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    15
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    16
    Permission is hereby granted, free of charge, to any person obtaining a copy
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    17
    of this software and associated documentation files (the "Software"), to deal
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    18
    in the Software without restriction, including without limitation the rights
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    19
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    20
    copies of the Software, and to permit persons to whom the Software is
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    21
    furnished to do so, subject to the following conditions:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    22
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    23
    The above copyright notice and this permission notice shall be included in
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    24
    all copies or substantial portions of the Software.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    25
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    26
    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    27
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    28
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    29
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    30
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    31
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    32
    THE SOFTWARE.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    33
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    34
'''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    35
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    36
__author__ = 'Marcel Hellkamp'
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    37
__version__ = '0.1'
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    38
__license__ = 'MIT'
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    39
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    40
from tempfile import TemporaryFile
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    41
from wsgiref.headers import Headers
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    42
import re, sys
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    43
try:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    44
    from io import BytesIO
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    45
except ImportError: # pragma: no cover (fallback for Python 2.5)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    46
    from StringIO import StringIO as BytesIO
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    47
10603
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 9946
diff changeset
    48
from six.moves.urllib.parse import parse_qs
65ad6980976e [py3k] import URL mangling functions using six.moves
Rémi Cardona <remi.cardona@logilab.fr>
parents: 9946
diff changeset
    49
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    50
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    51
################################ Helper & Misc ################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    52
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    53
# Some of these were copied from bottle: http://bottle.paws.de/
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    54
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    55
try:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    56
    from collections import MutableMapping as DictMixin
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    57
except ImportError: # pragma: no cover (fallback for Python 2.5)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    58
    from UserDict import DictMixin
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    59
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    60
class MultiDict(DictMixin):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    61
    """ A dict that remembers old values for each key """
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    62
    def __init__(self, *a, **k):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    63
        self.dict = dict()
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
    64
        for k, v in dict(*a, **k).items():
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    65
            self[k] = v
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    66
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    67
    def __len__(self): return len(self.dict)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    68
    def __iter__(self): return iter(self.dict)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    69
    def __contains__(self, key): return key in self.dict
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    70
    def __delitem__(self, key): del self.dict[key]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    71
    def keys(self): return self.dict.keys()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    72
    def __getitem__(self, key): return self.get(key, KeyError, -1)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    73
    def __setitem__(self, key, value): self.append(key, value)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    74
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    75
    def append(self, key, value): self.dict.setdefault(key, []).append(value)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    76
    def replace(self, key, value): self.dict[key] = [value]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    77
    def getall(self, key): return self.dict.get(key) or []
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    78
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    79
    def get(self, key, default=None, index=-1):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    80
        if key not in self.dict and default != KeyError:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    81
            return [default][index]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    82
        return self.dict[key][index]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    83
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    84
    def iterallitems(self):
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
    85
        for key, values in self.dict.items():
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    86
            for value in values:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    87
                yield key, value
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    88
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    89
def tob(data, enc='utf8'): # Convert strings to bytes (py2 and py3)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    90
    return data.encode(enc) if isinstance(data, unicode) else data
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    91
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    92
def copy_file(stream, target, maxread=-1, buffer_size=2*16):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    93
    ''' Read from :stream and write to :target until :maxread or EOF. '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    94
    size, read = 0, stream.read
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    95
    while 1:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    96
        to_read = buffer_size if maxread < 0 else min(buffer_size, maxread-size)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    97
        part = read(to_read)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    98
        if not part: return size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
    99
        target.write(part)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   100
        size += len(part)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   101
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   102
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   103
################################ Header Parser ################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   104
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   105
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   106
_special = re.escape('()<>@,;:\\"/[]?={} \t')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   107
_re_special = re.compile('[%s]' % _special)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   108
_qstr = '"(?:\\\\.|[^"])*"' # Quoted string
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   109
_value = '(?:[^%s]+|%s)' % (_special, _qstr) # Save or quoted string
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   110
_option = '(?:;|^)\s*([^%s]+)\s*=\s*(%s)' % (_special, _value)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   111
_re_option = re.compile(_option) # key=value part of an Content-Type like header
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   112
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   113
def header_quote(val):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   114
    if not _re_special.search(val):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   115
        return val
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   116
    return '"' + val.replace('\\','\\\\').replace('"','\\"') + '"'
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   117
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   118
def header_unquote(val, filename=False):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   119
    if val[0] == val[-1] == '"':
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   120
        val = val[1:-1]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   121
        if val[1:3] == ':\\' or val[:2] == '\\\\': 
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   122
            val = val.split('\\')[-1] # fix ie6 bug: full path --> filename
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   123
        return val.replace('\\\\','\\').replace('\\"','"')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   124
    return val
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   125
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   126
def parse_options_header(header, options=None):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   127
    if ';' not in header:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   128
        return header.lower().strip(), {}
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   129
    ctype, tail = header.split(';', 1)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   130
    options = options or {}
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   131
    for match in _re_option.finditer(tail):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   132
        key = match.group(1).lower()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   133
        value = header_unquote(match.group(2), key=='filename')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   134
        options[key] = value
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   135
    return ctype, options
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   136
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   137
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   138
################################## Multipart ##################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   139
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   140
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   141
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   142
class MultipartError(ValueError): pass
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   143
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   144
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   145
class MultipartParser(object):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   146
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   147
    def __init__(self, stream, boundary, content_length=-1,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   148
                 disk_limit=2**30, mem_limit=2**20, memfile_limit=2**18,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   149
                 buffer_size=2**16, charset='latin1'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   150
        ''' Parse a multipart/form-data byte stream. This object is an iterator
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   151
            over the parts of the message.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   152
            
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   153
            :param stream: A file-like stream. Must implement ``.read(size)``.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   154
            :param boundary: The multipart boundary as a byte string.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   155
            :param content_length: The maximum number of bytes to read.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   156
        '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   157
        self.stream, self.boundary = stream, boundary
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   158
        self.content_length = content_length
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   159
        self.disk_limit = disk_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   160
        self.memfile_limit = memfile_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   161
        self.mem_limit = min(mem_limit, self.disk_limit)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   162
        self.buffer_size = min(buffer_size, self.mem_limit)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   163
        self.charset = charset
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   164
        if self.buffer_size - 6 < len(boundary): # "--boundary--\r\n"
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   165
            raise MultipartError('Boundary does not fit into buffer_size.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   166
        self._done = []
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   167
        self._part_iter = None
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   168
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   169
    def __iter__(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   170
        ''' Iterate over the parts of the multipart message. '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   171
        if not self._part_iter:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   172
            self._part_iter = self._iterparse()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   173
        for part in self._done:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   174
            yield part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   175
        for part in self._part_iter:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   176
            self._done.append(part)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   177
            yield part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   178
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   179
    def parts(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   180
        ''' Returns a list with all parts of the multipart message. '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   181
        return list(iter(self))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   182
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   183
    def get(self, name, default=None):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   184
        ''' Return the first part with that name or a default value (None). '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   185
        for part in self:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   186
            if name == part.name:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   187
                return part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   188
        return default
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   189
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   190
    def get_all(self, name):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   191
        ''' Return a list of parts with that name. '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   192
        return [p for p in self if p.name == name]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   193
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   194
    def _lineiter(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   195
        ''' Iterate over a binary file-like object line by line. Each line is
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   196
            returned as a (line, line_ending) tuple. If the line does not fit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   197
            into self.buffer_size, line_ending is empty and the rest of the line
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   198
            is returned with the next iteration.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   199
        '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   200
        read = self.stream.read
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   201
        maxread, maxbuf = self.content_length, self.buffer_size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   202
        _bcrnl = tob('\r\n')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   203
        _bcr = _bcrnl[:1]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   204
        _bnl = _bcrnl[1:]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   205
        _bempty = _bcrnl[:0] # b'rn'[:0] -> b''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   206
        buffer = _bempty # buffer for the last (partial) line
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   207
        while 1:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   208
            data = read(maxbuf if maxread < 0 else min(maxbuf, maxread))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   209
            maxread -= len(data)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   210
            lines = (buffer+data).splitlines(True)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   211
            len_first_line = len(lines[0])
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   212
            # be sure that the first line does not become too big
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   213
            if len_first_line > self.buffer_size:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   214
                # at the same time don't split a '\r\n' accidentally
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   215
                if (len_first_line == self.buffer_size+1 and
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   216
                    lines[0].endswith(_bcrnl)):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   217
                    splitpos = self.buffer_size - 1
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   218
                else:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   219
                    splitpos = self.buffer_size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   220
                lines[:1] = [lines[0][:splitpos],
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   221
                             lines[0][splitpos:]]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   222
            if data:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   223
                buffer = lines[-1]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   224
                lines = lines[:-1]
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   225
            for line in lines:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   226
                if line.endswith(_bcrnl): yield line[:-2], _bcrnl
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   227
                elif line.endswith(_bnl): yield line[:-1], _bnl
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   228
                elif line.endswith(_bcr): yield line[:-1], _bcr
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   229
                else:                     yield line, _bempty
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   230
            if not data:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   231
                break
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   232
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   233
    def _iterparse(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   234
        lines, line = self._lineiter(), ''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   235
        separator = tob('--') + tob(self.boundary)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   236
        terminator = tob('--') + tob(self.boundary) + tob('--')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   237
        # Consume first boundary. Ignore leading blank lines
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   238
        for line, nl in lines:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   239
            if line: break
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   240
        if line != separator:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   241
            raise MultipartError("Stream does not start with boundary")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   242
        # For each part in stream...
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   243
        mem_used, disk_used = 0, 0 # Track used resources to prevent DoS
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   244
        is_tail = False # True if the last line was incomplete (cutted)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   245
        opts = {'buffer_size': self.buffer_size,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   246
                'memfile_limit': self.memfile_limit,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   247
                'charset': self.charset}
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   248
        part = MultipartPart(**opts)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   249
        for line, nl in lines:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   250
            if line == terminator and not is_tail:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   251
                part.file.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   252
                yield part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   253
                break
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   254
            elif line == separator and not is_tail:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   255
                if part.is_buffered(): mem_used  += part.size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   256
                else:                  disk_used += part.size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   257
                part.file.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   258
                yield part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   259
                part = MultipartPart(**opts)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   260
            else:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   261
                is_tail = not nl # The next line continues this one
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   262
                part.feed(line, nl)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   263
                if part.is_buffered():
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   264
                    if part.size + mem_used > self.mem_limit:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   265
                        raise MultipartError("Memory limit reached.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   266
                elif part.size + disk_used > self.disk_limit:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   267
                    raise MultipartError("Disk limit reached.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   268
        if line != terminator:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   269
            raise MultipartError("Unexpected end of multipart stream.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   270
            
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   271
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   272
class MultipartPart(object):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   273
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   274
    def __init__(self, buffer_size=2**16, memfile_limit=2**18, charset='latin1'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   275
        self.headerlist = []
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   276
        self.headers = None
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   277
        self.file = False
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   278
        self.size = 0
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   279
        self._buf = tob('')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   280
        self.disposition, self.name, self.filename = None, None, None
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   281
        self.content_type, self.charset = None, charset
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   282
        self.memfile_limit = memfile_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   283
        self.buffer_size = buffer_size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   284
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   285
    def feed(self, line, nl=''):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   286
        if self.file:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   287
            return self.write_body(line, nl)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   288
        return self.write_header(line, nl)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   289
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   290
    def write_header(self, line, nl):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   291
        line = line.decode(self.charset or 'latin1')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   292
        if not nl: raise MultipartError('Unexpected end of line in header.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   293
        if not line.strip(): # blank line -> end of header segment
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   294
            self.finish_header()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   295
        elif line[0] in ' \t' and self.headerlist:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   296
            name, value = self.headerlist.pop()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   297
            self.headerlist.append((name, value+line.strip()))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   298
        else:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   299
            if ':' not in line:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   300
                raise MultipartError("Syntax error in header: No colon.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   301
            name, value = line.split(':', 1)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   302
            self.headerlist.append((name.strip(), value.strip()))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   303
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   304
    def write_body(self, line, nl):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   305
        if not line and not nl: return # This does not even flush the buffer
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   306
        self.size += len(line) + len(self._buf)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   307
        self.file.write(self._buf + line)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   308
        self._buf = nl
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   309
        if self.content_length > 0 and self.size > self.content_length:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   310
            raise MultipartError('Size of body exceeds Content-Length header.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   311
        if self.size > self.memfile_limit and isinstance(self.file, BytesIO):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   312
            # TODO: What about non-file uploads that exceed the memfile_limit?
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   313
            self.file, old = TemporaryFile(mode='w+b'), self.file
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   314
            old.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   315
            copy_file(old, self.file, self.size, self.buffer_size)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   316
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   317
    def finish_header(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   318
        self.file = BytesIO()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   319
        self.headers = Headers(self.headerlist)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   320
        cdis = self.headers.get('Content-Disposition','')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   321
        ctype = self.headers.get('Content-Type','')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   322
        clen = self.headers.get('Content-Length','-1')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   323
        if not cdis:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   324
            raise MultipartError('Content-Disposition header is missing.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   325
        self.disposition, self.options = parse_options_header(cdis)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   326
        self.name = self.options.get('name')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   327
        self.filename = self.options.get('filename')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   328
        self.content_type, options = parse_options_header(ctype)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   329
        self.charset = options.get('charset') or self.charset
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   330
        self.content_length = int(self.headers.get('Content-Length','-1'))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   331
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   332
    def is_buffered(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   333
        ''' Return true if the data is fully buffered in memory.'''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   334
        return isinstance(self.file, BytesIO)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   335
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   336
    @property
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   337
    def value(self):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   338
        ''' Data decoded with the specified charset '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   339
        pos = self.file.tell()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   340
        self.file.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   341
        val = self.file.read()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   342
        self.file.seek(pos)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   343
        return val.decode(self.charset)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   344
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   345
    def save_as(self, path):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   346
        fp = open(path, 'wb')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   347
        pos = self.file.tell()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   348
        try:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   349
            self.file.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   350
            size = copy_file(self.file, fp)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   351
        finally:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   352
            self.file.seek(pos)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   353
        return size
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   354
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   355
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   356
#################################### WSGI ####################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   357
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   358
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   359
def parse_form_data(environ, charset='utf8', strict=False, **kw):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   360
    ''' Parse form data from an environ dict and return a (forms, files) tuple.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   361
        Both tuple values are dictionaries with the form-field name as a key
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   362
        (unicode) and lists as values (multiple values per key are possible).
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   363
        The forms-dictionary contains form-field values as unicode strings.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   364
        The files-dictionary contains :class:`MultipartPart` instances, either
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   365
        because the form-field was a file-upload or the value is to big to fit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   366
        into memory limits.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   367
        
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   368
        :param environ: An WSGI environment dict.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   369
        :param charset: The charset to use if unsure. (default: utf8)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   370
        :param strict: If True, raise :exc:`MultipartError` on any parsing
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   371
                       errors. These are silently ignored by default.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   372
    '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   373
        
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   374
    forms, files = MultiDict(), MultiDict()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   375
    try:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   376
        if environ.get('REQUEST_METHOD','GET').upper() not in ('POST', 'PUT'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   377
            raise MultipartError("Request method other than POST or PUT.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   378
        content_length = int(environ.get('CONTENT_LENGTH', '-1'))
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   379
        content_type = environ.get('CONTENT_TYPE', '')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   380
        if not content_type:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   381
            raise MultipartError("Missing Content-Type header.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   382
        content_type, options = parse_options_header(content_type)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   383
        stream = environ.get('wsgi.input') or BytesIO()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   384
        kw['charset'] = charset = options.get('charset', charset)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   385
        if content_type == 'multipart/form-data':
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   386
            boundary = options.get('boundary','')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   387
            if not boundary:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   388
                raise MultipartError("No boundary for multipart/form-data.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   389
            for part in MultipartParser(stream, boundary, content_length, **kw):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   390
                if part.filename or not part.is_buffered():
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   391
                    files[part.name] = part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   392
                else: # TODO: Big form-fields are in the files dict. really?
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   393
                    forms[part.name] = part.value
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   394
        elif content_type in ('application/x-www-form-urlencoded',
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   395
                              'application/x-url-encoded'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   396
            mem_limit = kw.get('mem_limit', 2**20)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   397
            if content_length > mem_limit:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   398
                raise MultipartError("Request to big. Increase MAXMEM.")
9946
ec88c1a1904a [wsgi] Fix unicode decoding in POST
Christophe de Vienne <christophe@unlish.com>
parents: 9735
diff changeset
   399
            data = stream.read(mem_limit)
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   400
            if stream.read(1): # These is more that does not fit mem_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   401
                raise MultipartError("Request to big. Increase MAXMEM.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   402
            data = parse_qs(data, keep_blank_values=True)
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
   403
            for key, values in data.items():
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   404
                for value in values:
9946
ec88c1a1904a [wsgi] Fix unicode decoding in POST
Christophe de Vienne <christophe@unlish.com>
parents: 9735
diff changeset
   405
                    forms[key] = value.decode(charset)
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   406
        else:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   407
            raise MultipartError("Unsupported content type.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   408
    except MultipartError:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   409
        if strict: raise
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   410
    return forms, files
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   411