multipart.py
author Rémi Cardona <remi.cardona@logilab.fr>
Mon, 21 Sep 2015 18:04:22 +0200
changeset 10722 1d0b5aef7523
parent 10662 10942ed172de
child 10741 6666e52d8dd9
permissions -rw-r--r--
[tests] Stop counting pickle bytes in ResultSet tests
Between python 2's pickle and cPickle implementations, the various protocol versions added in python 3, the length of the pickled string is completely unpredictable. Instead, do a simple round trip and test a few attributes.
# -*- coding: utf-8 -*-
'''
Parser for multipart/form-data
==============================

This module provides a parser for the multipart/form-data format. It can read
from a file, a socket or a WSGI environment. The parser can be used to replace
cgi.FieldStorage (without the bugs) and works with Python 2.5+ and 3.x (2to3).

Licence (MIT)
-------------

    Copyright (c) 2010, Marcel Hellkamp.
    Inspired by the Werkzeug library: http://werkzeug.pocoo.org/

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to deal
    in the Software without restriction, including without limitation the rights
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    copies of the Software, and to permit persons to whom the Software is
    furnished to do so, subject to the following conditions:

    The above copyright notice and this permission notice shall be included in
    all copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
    THE SOFTWARE.

'''

__author__ = 'Marcel Hellkamp'
__version__ = '0.1'
__license__ = 'MIT'

from tempfile import TemporaryFile
from wsgiref.headers import Headers
import re, sys
try:
    from io import BytesIO
except ImportError: # pragma: no cover (fallback for Python 2.5)
    from StringIO import StringIO as BytesIO

from six.moves.urllib.parse import parse_qs

##############################################################################
################################ Helper & Misc ################################
##############################################################################
# Some of these were copied from bottle: http://bottle.paws.de/

try:
    from collections import MutableMapping as DictMixin
except ImportError: # pragma: no cover (fallback for Python 2.5)
    from UserDict import DictMixin

class MultiDict(DictMixin):
    """ A dict that remembers old values for each key """
    def __init__(self, *a, **k):
        self.dict = dict()
        for k, v in dict(*a, **k).items():
            self[k] = v

    def __len__(self): return len(self.dict)
    def __iter__(self): return iter(self.dict)
    def __contains__(self, key): return key in self.dict
    def __delitem__(self, key): del self.dict[key]
    def keys(self): return self.dict.keys()
    def __getitem__(self, key): return self.get(key, KeyError, -1)
    def __setitem__(self, key, value): self.append(key, value)

    def append(self, key, value): self.dict.setdefault(key, []).append(value)
    def replace(self, key, value): self.dict[key] = [value]
    def getall(self, key): return self.dict.get(key) or []

    def get(self, key, default=None, index=-1):
        if key not in self.dict and default != KeyError:
            return [default][index]
        return self.dict[key][index]

    def iterallitems(self):
        for key, values in self.dict.items():
            for value in values:
                yield key, value
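
# Illustration only (not part of the original module): a minimal sketch of the
# "assignment appends" behaviour of MultiDict. ``DemoMultiDict`` is a
# hypothetical, trimmed stand-in that mirrors the class above so the snippet
# runs on its own; it assumes nothing beyond the methods shown there.

```python
try:
    from collections.abc import MutableMapping
except ImportError:  # Python 2 fallback
    from collections import MutableMapping


class DemoMultiDict(MutableMapping):
    """Hypothetical stand-in: keeps every value ever assigned to a key."""

    def __init__(self):
        self.dict = {}

    def __len__(self):
        return len(self.dict)

    def __iter__(self):
        return iter(self.dict)

    def __delitem__(self, key):
        del self.dict[key]

    def __getitem__(self, key):
        # Plain lookup returns the most recent value.
        return self.dict[key][-1]

    def __setitem__(self, key, value):
        # Assignment appends instead of overwriting.
        self.dict.setdefault(key, []).append(value)

    def getall(self, key):
        return self.dict.get(key) or []

    def replace(self, key, value):
        self.dict[key] = [value]


form = DemoMultiDict()
form['tag'] = 'a'
form['tag'] = 'b'
latest = form['tag']           # most recent value
history = form.getall('tag')   # every value, oldest first
form.replace('tag', 'c')       # drops the history for this key
```

# Repeated form fields (several inputs sharing one name) are the usual reason
# for this shape: a plain lookup stays simple while getall() keeps every value.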

def tob(data, enc='utf8'): # Convert strings to bytes (py2 and py3)
    return data.encode(enc) if isinstance(data, unicode) else data

def copy_file(stream, target, maxread=-1, buffer_size=2**16):
    ''' Read from :stream and write to :target until :maxread or EOF. '''
    size, read = 0, stream.read
    while 1:
        to_read = buffer_size if maxread < 0 else min(buffer_size, maxread-size)
        part = read(to_read)
        if not part: return size
        target.write(part)
        size += len(part)
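
# Illustration only (not part of the original module): a small sketch of how
# the maxread cap in copy_file behaves. The function is repeated here so the
# snippet is self-contained; the in-memory streams and the tiny buffer_size
# are assumptions chosen to make the chunking visible.

```python
from io import BytesIO


def copy_file(stream, target, maxread=-1, buffer_size=2**16):
    """Copy from stream to target until maxread bytes or EOF; return the count."""
    size, read = 0, stream.read
    while True:
        # Never ask for more than the remaining allowance.
        to_read = buffer_size if maxread < 0 else min(buffer_size, maxread - size)
        part = read(to_read)
        if not part:
            return size
        target.write(part)
        size += len(part)


source = BytesIO(b'x' * 100)
target = BytesIO()

# First call stops at the cap, leaving the rest of the stream unread.
copied = copy_file(source, target, maxread=30, buffer_size=8)

# Second call has no cap and drains the stream to EOF.
remainder = copy_file(source, target)
```

# Once the cap is reached, to_read becomes 0, read(0) returns an empty string
# and the loop ends, so the source stream is left positioned just after the
# copied bytes.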

##############################################################################
################################ Header Parser ################################
##############################################################################

_special = re.escape('()<>@,;:\\"/[]?={} \t')
_re_special = re.compile('[%s]' % _special)
_qstr = '"(?:\\\\.|[^"])*"' # Quoted string
_value = '(?:[^%s]+|%s)' % (_special, _qstr) # Safe or quoted string
_option = r'(?:;|^)\s*([^%s]+)\s*=\s*(%s)' % (_special, _value)
_re_option = re.compile(_option) # key=value part of a Content-Type like header

def header_quote(val):
    if not _re_special.search(val):
        return val
    return '"' + val.replace('\\','\\\\').replace('"','\\"') + '"'

def header_unquote(val, filename=False):
    if val[0] == val[-1] == '"':
        val = val[1:-1]
        if val[1:3] == ':\\' or val[:2] == '\\\\':
            val = val.split('\\')[-1] # fix ie6 bug: full path --> filename
        return val.replace('\\\\','\\').replace('\\"','"')
    return val

def parse_options_header(header, options=None):
    if ';' not in header:
        return header.lower().strip(), {}
    ctype, tail = header.split(';', 1)
    options = options or {}
    for match in _re_option.finditer(tail):
        key = match.group(1).lower()
        value = header_unquote(match.group(2), key=='filename')
        options[key] = value
    return ctype, options
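
# Illustration only (not part of the original module): a sketch of what
# parse_options_header returns for two typical headers. The helpers are
# repeated (lightly reformatted) so the snippet runs on its own; the header
# values themselves are made-up examples.

```python
import re

_special = re.escape('()<>@,;:\\"/[]?={} \t')
_qstr = '"(?:\\\\.|[^"])*"'                    # quoted string
_value = '(?:[^%s]+|%s)' % (_special, _qstr)   # token or quoted string
_re_option = re.compile(
    r'(?:;|^)\s*([^%s]+)\s*=\s*(%s)' % (_special, _value))


def header_unquote(val, filename=False):
    if val[0] == val[-1] == '"':
        val = val[1:-1]
        if val[1:3] == ':\\' or val[:2] == '\\\\':
            val = val.split('\\')[-1]  # full Windows path -> bare filename
        return val.replace('\\\\', '\\').replace('\\"', '"')
    return val


def parse_options_header(header, options=None):
    if ';' not in header:
        return header.lower().strip(), {}
    ctype, tail = header.split(';', 1)
    options = options or {}
    for match in _re_option.finditer(tail):
        key = match.group(1).lower()
        options[key] = header_unquote(match.group(2), key == 'filename')
    return ctype, options


# A Content-Type header carrying the multipart boundary.
ctype, opts = parse_options_header(
    'multipart/form-data; boundary="----abc"; charset=utf-8')

# A Content-Disposition header where the client sent a full Windows path.
disp, disp_opts = parse_options_header(
    'form-data; name="upload"; filename="C:\\Users\\me\\report.txt"')
```

# Option keys are lower-cased and quoted values are unquoted. Note that the
# main value is only lower-cased and stripped when the header has no options
# at all; in the branch with options it is returned exactly as it was split.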
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   136
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   137
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   138
################################## Multipart ##################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   139
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   140
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   141
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   142
class MultipartError(ValueError): pass
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   143
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   144
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   145
class MultipartParser(object):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   146
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   147
    def __init__(self, stream, boundary, content_length=-1,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   148
                 disk_limit=2**30, mem_limit=2**20, memfile_limit=2**18,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   149
                 buffer_size=2**16, charset='latin1'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   150
        ''' Parse a multipart/form-data byte stream. This object is an iterator
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   151
            over the parts of the message.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   152
            
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   153
            :param stream: A file-like stream. Must implement ``.read(size)``.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   154
            :param boundary: The multipart boundary as a byte string.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   155
            :param content_length: The maximum number of bytes to read.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   156
        '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   157
        self.stream, self.boundary = stream, boundary
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   158
        self.content_length = content_length
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   159
        self.disk_limit = disk_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   160
        self.memfile_limit = memfile_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   161
        self.mem_limit = min(mem_limit, self.disk_limit)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   162
        self.buffer_size = min(buffer_size, self.mem_limit)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   163
        self.charset = charset
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   164
        if self.buffer_size - 6 < len(boundary): # "--boundary--\r\n"
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   165
            raise MultipartError('Boundary does not fit into buffer_size.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   166
        self._done = []
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   167
        self._part_iter = None
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   168
    
    def __iter__(self):
        ''' Iterate over the parts of the multipart message. '''
        if not self._part_iter:
            self._part_iter = self._iterparse()
        for part in self._done:
            yield part
        for part in self._part_iter:
            self._done.append(part)
            yield part
    
    def parts(self):
        ''' Returns a list with all parts of the multipart message. '''
        return list(iter(self))
    
    def get(self, name, default=None):
        ''' Return the first part with that name or a default value (None). '''
        for part in self:
            if name == part.name:
                return part
        return default

    def get_all(self, name):
        ''' Return a list of parts with that name. '''
        return [p for p in self if p.name == name]

    def _lineiter(self):
        ''' Iterate over a binary file-like object line by line. Each line is
            returned as a (line, line_ending) tuple. If the line does not fit
            into self.buffer_size, line_ending is empty and the rest of the line
            is returned with the next iteration.
        '''
        read = self.stream.read
        maxread, maxbuf = self.content_length, self.buffer_size
        _bcrnl = tob('\r\n')
        _bcr = _bcrnl[:1]
        _bnl = _bcrnl[1:]
        _bempty = _bcrnl[:0] # b'\r\n'[:0] -> b''
        buffer = _bempty # buffer for the last (partial) line
        while 1:
            data = read(maxbuf if maxread < 0 else min(maxbuf, maxread))
            maxread -= len(data)
            lines = (buffer+data).splitlines(True)
            len_first_line = len(lines[0])
            # be sure that the first line does not become too big
            if len_first_line > self.buffer_size:
                # at the same time don't split a '\r\n' accidentally
                if (len_first_line == self.buffer_size+1 and
                    lines[0].endswith(_bcrnl)):
                    splitpos = self.buffer_size - 1
                else:
                    splitpos = self.buffer_size
                lines[:1] = [lines[0][:splitpos],
                             lines[0][splitpos:]]
            if data:
                buffer = lines[-1]
                lines = lines[:-1]
            for line in lines:
                if line.endswith(_bcrnl): yield line[:-2], _bcrnl
                elif line.endswith(_bnl): yield line[:-1], _bnl
                elif line.endswith(_bcr): yield line[:-1], _bcr
                else:                     yield line, _bempty
            if not data:
                break
    
    def _iterparse(self):
        ''' Parse the stream and yield each MultipartPart as it completes. '''
        lines, line = self._lineiter(), ''
        separator = tob('--') + tob(self.boundary)
        terminator = tob('--') + tob(self.boundary) + tob('--')
        # Consume first boundary. Ignore leading blank lines
        for line, nl in lines:
            if line: break
        if line != separator:
            raise MultipartError("Stream does not start with boundary")
        # For each part in stream...
        mem_used, disk_used = 0, 0 # Track used resources to prevent DoS
        is_tail = False # True if the last line was incomplete (cut off)
        opts = {'buffer_size': self.buffer_size,
                'memfile_limit': self.memfile_limit,
                'charset': self.charset}
        part = MultipartPart(**opts)
        for line, nl in lines:
            if line == terminator and not is_tail:
                part.file.seek(0)
                yield part
                break
            elif line == separator and not is_tail:
                if part.is_buffered(): mem_used  += part.size
                else:                  disk_used += part.size
                part.file.seek(0)
                yield part
                part = MultipartPart(**opts)
            else:
                is_tail = not nl # The next line continues this one
                part.feed(line, nl)
                if part.is_buffered():
                    if part.size + mem_used > self.mem_limit:
                        raise MultipartError("Memory limit reached.")
                elif part.size + disk_used > self.disk_limit:
                    raise MultipartError("Disk limit reached.")
        if line != terminator:
            raise MultipartError("Unexpected end of multipart stream.")
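
# Illustrative sketch, not part of the original module. It mirrors the
# (line, line_ending) contract described in the _lineiter() docstring above,
# but on an in-memory bytes object rather than a stream. The name
# `_sketch_split_lines` and its `buffer_size` default are assumptions made
# for this example only, and the chunking here is simpler than in
# _lineiter(): there are no incremental reads and no Content-Length handling.
# Assumes Python 2.6+ or 3.x for the b'' literals.
def _sketch_split_lines(data, buffer_size=8):
    ''' Yield (line, line_ending) tuples; over-long lines arrive in chunks. '''
    for raw in data.splitlines(True):  # True keeps the line endings
        if raw.endswith(b'\r\n'):
            line, ending = raw[:-2], b'\r\n'
        elif raw.endswith(b'\n'):
            line, ending = raw[:-1], b'\n'
        elif raw.endswith(b'\r'):
            line, ending = raw[:-1], b'\r'
        else:
            line, ending = raw, b''
        # A line longer than buffer_size is emitted in pieces. Every piece
        # but the last carries an empty ending, which tells the caller that
        # the next tuple continues the same line.
        while len(line) > buffer_size:
            yield line[:buffer_size], b''
            line = line[buffer_size:]
        yield line, ending

# Usage, with the result as shown by Python 3:
#   list(_sketch_split_lines(b'ab\r\ncdefghijkl\nxy', 4))
#   -> [(b'ab', b'\r\n'), (b'cdef', b''), (b'ghij', b''),
#       (b'kl', b'\n'), (b'xy', b'')]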
            

class MultipartPart(object):

    def __init__(self, buffer_size=2**16, memfile_limit=2**18, charset='latin1'):
        self.headerlist = []
        self.headers = None
        self.file = False # Replaced by a file object in finish_header()
        self.size = 0
        self._buf = tob('')
        self.disposition, self.name, self.filename = None, None, None
        self.content_type, self.charset = None, charset
        self.memfile_limit = memfile_limit
        self.buffer_size = buffer_size

    def feed(self, line, nl=''):
        # Lines belong to the header until finish_header() creates self.file.
        if self.file:
            return self.write_body(line, nl)
        return self.write_header(line, nl)

    def write_header(self, line, nl):
        line = line.decode(self.charset or 'latin1')
        if not nl: raise MultipartError('Unexpected end of line in header.')
        if not line.strip(): # blank line -> end of header segment
            self.finish_header()
        elif line[0] in ' \t' and self.headerlist:
            # Continuation line: append it to the previous header value
            name, value = self.headerlist.pop()
            self.headerlist.append((name, value+line.strip()))
        else:
            if ':' not in line:
                raise MultipartError("Syntax error in header: No colon.")
            name, value = line.split(':', 1)
            self.headerlist.append((name.strip(), value.strip()))

    def write_body(self, line, nl):
        if not line and not nl: return # This does not even flush the buffer
        self.size += len(line) + len(self._buf)
        self.file.write(self._buf + line)
        # Hold back the line ending: the one that precedes a boundary belongs
        # to the delimiter, not to the body.
        self._buf = nl
        if self.content_length > 0 and self.size > self.content_length:
            raise MultipartError('Size of body exceeds Content-Length header.')
        if self.size > self.memfile_limit and isinstance(self.file, BytesIO):
            # TODO: What about non-file uploads that exceed the memfile_limit?
            self.file, old = TemporaryFile(mode='w+b'), self.file
            old.seek(0)
            copy_file(old, self.file, self.size, self.buffer_size)

    def finish_header(self):
        self.file = BytesIO()
        self.headers = Headers(self.headerlist)
        cdis = self.headers.get('Content-Disposition','')
        ctype = self.headers.get('Content-Type','')
        clen = self.headers.get('Content-Length','-1')
        if not cdis:
            raise MultipartError('Content-Disposition header is missing.')
        self.disposition, self.options = parse_options_header(cdis)
        self.name = self.options.get('name')
        self.filename = self.options.get('filename')
        self.content_type, options = parse_options_header(ctype)
        self.charset = options.get('charset') or self.charset
        self.content_length = int(clen)

    def is_buffered(self):
        ''' Return true if the data is fully buffered in memory.'''
        return isinstance(self.file, BytesIO)

    @property
    def value(self):
        ''' Data decoded with the specified charset '''
        pos = self.file.tell()
        self.file.seek(0)
        val = self.file.read()
        self.file.seek(pos)
        return val.decode(self.charset)
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   345
    def save_as(self, path):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   346
        fp = open(path, 'wb')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   347
        pos = self.file.tell()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   348
        try:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   349
            self.file.seek(0)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   350
            size = copy_file(self.file, fp)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   351
        finally:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   352
            self.file.seek(pos)
            fp.close()  # release the destination file handle even on error
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   353
        return size
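
# Both `value` and `save_as` above follow the same pattern: remember the
# current offset, rewind, consume the whole file, then restore the offset.
# A minimal sketch of that pattern as a reusable context manager follows.
# The name `rewound` is hypothetical and not part of this module.

```python
from contextlib import contextmanager
from io import BytesIO


@contextmanager
def rewound(fileobj):
    """Yield fileobj positioned at offset 0, then restore its old offset."""
    pos = fileobj.tell()
    try:
        fileobj.seek(0)
        yield fileobj
    finally:
        # Runs even if the body raises, so the caller's offset survives.
        fileobj.seek(pos)


buf = BytesIO(b'caf\xc3\xa9 au lait')
buf.seek(5)                      # pretend a reader stopped mid-stream
with rewound(buf) as f:
    text = f.read().decode('utf8')
```

# After the `with` block, `buf.tell()` is back at the offset it had before.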
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   354
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   355
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   356
#################################### WSGI ####################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   357
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   358
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   359
def parse_form_data(environ, charset='utf8', strict=False, **kw):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   360
    ''' Parse form data from an environ dict and return a (forms, files) tuple.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   361
        Both tuple values are dictionaries with the form-field name as a key
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   362
        (unicode) and lists as values (multiple values per key are possible).
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   363
        The forms-dictionary contains form-field values as unicode strings.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   364
        The files-dictionary contains :class:`MultipartPart` instances, either
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   365
        because the form-field was a file-upload or the value is too big to fit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   366
        into memory limits.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   367
        
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   368
        :param environ: A WSGI environment dict.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   369
        :param charset: The charset to use if unsure. (default: utf8)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   370
        :param strict: If True, raise :exc:`MultipartError` on any parsing
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   371
                       errors. These are silently ignored by default.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   372
    '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   373
        
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   374
    forms, files = MultiDict(), MultiDict()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   375
    try:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   376
        if environ.get('REQUEST_METHOD','GET').upper() not in ('POST', 'PUT'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   377
            raise MultipartError("Request method other than POST or PUT.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   378
        # CONTENT_LENGTH may be present but empty; treat that as unknown (-1).
        content_length = int(environ.get('CONTENT_LENGTH') or -1)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   379
        content_type = environ.get('CONTENT_TYPE', '')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   380
        if not content_type:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   381
            raise MultipartError("Missing Content-Type header.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   382
        content_type, options = parse_options_header(content_type)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   383
        stream = environ.get('wsgi.input') or BytesIO()
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   384
        kw['charset'] = charset = options.get('charset', charset)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   385
        if content_type == 'multipart/form-data':
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   386
            boundary = options.get('boundary','')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   387
            if not boundary:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   388
                raise MultipartError("No boundary for multipart/form-data.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   389
            for part in MultipartParser(stream, boundary, content_length, **kw):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   390
                if part.filename or not part.is_buffered():
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   391
                    files[part.name] = part
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   392
                else: # TODO: Big form-fields are in the files dict. really?
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   393
                    forms[part.name] = part.value
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   394
        elif content_type in ('application/x-www-form-urlencoded',
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   395
                              'application/x-url-encoded'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   396
            mem_limit = kw.get('mem_limit', 2**20)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   397
            if content_length > mem_limit:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   398
                raise MultipartError("Request too big. Increase mem_limit.")
9946
ec88c1a1904a [wsgi] Fix unicode decoding in POST
Christophe de Vienne <christophe@unlish.com>
parents: 9735
diff changeset
   399
            data = stream.read(mem_limit)
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   400
            if stream.read(1): # There is more data than fits in mem_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   401
                raise MultipartError("Request too big. Increase mem_limit.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   402
            data = parse_qs(data, keep_blank_values=True)
10662
10942ed172de [py3k] dict.iteritems → dict.items
Rémi Cardona <remi.cardona@logilab.fr>
parents: 10603
diff changeset
   403
            for key, values in data.items():
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   404
                for value in values:
9946
ec88c1a1904a [wsgi] Fix unicode decoding in POST
Christophe de Vienne <christophe@unlish.com>
parents: 9735
diff changeset
   405
                    forms[key] = value.decode(charset)
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   406
        else:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   407
            raise MultipartError("Unsupported content type.")
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   408
    except MultipartError:
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   409
        if strict: raise
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   410
    return forms, files
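
# The urlencoded branch above guards memory twice: it rejects a declared
# CONTENT_LENGTH above mem_limit, then reads at most mem_limit bytes and
# probes for one more byte in case the header understated the body size.
# A self-contained sketch of that bounded read follows, using only the
# standard library. `read_urlencoded_form` is a hypothetical helper, it
# raises ValueError instead of MultipartError, and it returns a plain dict
# of lists rather than a MultiDict.

```python
from io import BytesIO
from urllib.parse import parse_qs


def read_urlencoded_form(environ, charset='utf8', mem_limit=2**20):
    """Parse a urlencoded body without reading more than mem_limit bytes."""
    # An absent or empty CONTENT_LENGTH is treated as unknown (-1).
    content_length = int(environ.get('CONTENT_LENGTH') or -1)
    if content_length > mem_limit:
        raise ValueError('Request too big.')
    stream = environ.get('wsgi.input') or BytesIO()
    data = stream.read(mem_limit)
    if stream.read(1):  # the body is longer than mem_limit
        raise ValueError('Request too big.')
    # Decode first, then let parse_qs resolve percent-escapes with charset.
    return parse_qs(data.decode(charset), keep_blank_values=True,
                    encoding=charset)


body = b'name=R%C3%A9mi&tag=a&tag=b&empty='
environ = {'CONTENT_LENGTH': str(len(body)), 'wsgi.input': BytesIO(body)}
fields = read_urlencoded_form(environ)
```

# Each key maps to a list, so the repeated `tag` field keeps both values and
# `empty` survives as an empty string because keep_blank_values is set.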
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   411