multipart.py
author Alain Leufroy <alain.leufroy@logilab.fr>
Tue, 03 Jun 2014 12:37:54 +0200
changeset 9886 1b26289217a3
parent 9735 b71158815bc8
child 9946 ec88c1a1904a
permissions -rw-r--r--
[devtools] improve error message when postgresql tools are missing By default in at least Debian, some pg tools are not present in the PATH. Or they may not be installed. But the tests tools expects them to be in the PATH, and give an unhelpful 'No such file or directory' backtrace if they're not found. To help devs using the pg tests we improve the error message.
9735
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
# -*- coding: utf-8 -*-
'''
Parser for multipart/form-data
==============================

This module provides a parser for the multipart/form-data format. It can read
from a file, a socket or a WSGI environment. The parser can be used to replace
cgi.FieldStorage (without the bugs) and works with Python 2.5+ and 3.x (2to3).

Licence (MIT)
-------------

    Copyright (c) 2010, Marcel Hellkamp.
    Inspired by the Werkzeug library: http://werkzeug.pocoo.org/

    Permission is hereby granted, free of charge, to any person obtaining a copy
    of this software and associated documentation files (the "Software"), to deal
    in the Software without restriction, including without limitation the rights
    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
    copies of the Software, and to permit persons to whom the Software is
    furnished to do so, subject to the following conditions:

    The above copyright notice and this permission notice shall be included in
    all copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
    THE SOFTWARE.

'''

__author__ = 'Marcel Hellkamp'
__version__ = '0.1'
__license__ = 'MIT'

from tempfile import TemporaryFile
from wsgiref.headers import Headers
import re, sys
try:
    from urlparse import parse_qs
except ImportError: # pragma: no cover (fallback for Python 2.5)
    from cgi import parse_qs
try:
    from io import BytesIO
except ImportError: # pragma: no cover (fallback for Python 2.5)
    from StringIO import StringIO as BytesIO

##############################################################################
################################ Helper & Misc ################################
##############################################################################
# Some of these were copied from bottle: http://bottle.paws.de/

try:
    from collections import MutableMapping as DictMixin
except ImportError: # pragma: no cover (fallback for Python 2.5)
    from UserDict import DictMixin

class MultiDict(DictMixin):
    """ A dict that remembers old values for each key """
    def __init__(self, *a, **k):
        self.dict = dict()
        for k, v in dict(*a, **k).iteritems():
            self[k] = v

    def __len__(self): return len(self.dict)
    def __iter__(self): return iter(self.dict)
    def __contains__(self, key): return key in self.dict
    def __delitem__(self, key): del self.dict[key]
    def keys(self): return self.dict.keys()
    def __getitem__(self, key): return self.get(key, KeyError, -1)
    def __setitem__(self, key, value): self.append(key, value)

    def append(self, key, value): self.dict.setdefault(key, []).append(value)
    def replace(self, key, value): self.dict[key] = [value]
    def getall(self, key): return self.dict.get(key) or []

    def get(self, key, default=None, index=-1):
        if key not in self.dict and default != KeyError:
            return [default][index]
        return self.dict[key][index]

    def iterallitems(self):
        for key, values in self.dict.iteritems():
            for value in values:
                yield key, value
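
# Illustration only (not part of the original module): a minimal Python 3
# sketch of the "remember old values" idea behind MultiDict. Assignment
# appends to a per-key list and lookup returns the newest value. The name
# MultiDictSketch and the sample field names are hypothetical.

```python
from collections.abc import MutableMapping


class MultiDictSketch(MutableMapping):
    """Keep every value stored under a key; lookups return the newest one."""

    def __init__(self):
        self._lists = {}

    def __getitem__(self, key):
        # Raises KeyError for unknown keys, like a plain dict.
        return self._lists[key][-1]

    def __setitem__(self, key, value):
        # Assignment never overwrites: it appends to the key's history.
        self._lists.setdefault(key, []).append(value)

    def __delitem__(self, key):
        del self._lists[key]

    def __iter__(self):
        return iter(self._lists)

    def __len__(self):
        return len(self._lists)

    def getall(self, key):
        # All values ever stored under the key, oldest first.
        return self._lists.get(key) or []


fields = MultiDictSketch()
fields['tag'] = 'a'
fields['tag'] = 'b'
newest = fields['tag']
history = fields.getall('tag')
```

# This matters for form data because a browser may send several fields with
# the same name (for example a group of checkboxes).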

def tob(data, enc='utf8'): # Convert strings to bytes (py2 and py3)
    return data.encode(enc) if isinstance(data, unicode) else data

def copy_file(stream, target, maxread=-1, buffer_size=2**16):
    ''' Read from :stream and write to :target until :maxread or EOF. '''
    size, read = 0, stream.read
    while 1:
        to_read = buffer_size if maxread < 0 else min(buffer_size, maxread-size)
        part = read(to_read)
        if not part: return size
        target.write(part)
        size += len(part)
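
# Illustration only (not part of the original module): a Python 3 sketch of
# the bounded, chunked copy that copy_file performs. The name copy_stream and
# the sample sizes are hypothetical. Once the maxread budget is used up the
# next request is read(0), which returns an empty result and ends the loop.

```python
from io import BytesIO


def copy_stream(stream, target, maxread=-1, buffer_size=2 ** 16):
    """Copy up to maxread bytes (everything if negative); return the count."""
    size = 0
    while True:
        # Never ask for more than what is left of the maxread budget.
        to_read = buffer_size if maxread < 0 else min(buffer_size, maxread - size)
        part = stream.read(to_read)
        if not part:  # EOF, or the budget is exhausted (read(0) is empty)
            return size
        target.write(part)
        size += len(part)


source = BytesIO(b'x' * 100)
sink = BytesIO()
copied = copy_stream(source, sink, maxread=70, buffer_size=32)
```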

##############################################################################
################################ Header Parser ################################
##############################################################################

_special = re.escape('()<>@,;:\\"/[]?={} \t')
_re_special = re.compile('[%s]' % _special)
_qstr = '"(?:\\\\.|[^"])*"' # Quoted string
_value = '(?:[^%s]+|%s)' % (_special, _qstr) # Safe or quoted string
_option = r'(?:;|^)\s*([^%s]+)\s*=\s*(%s)' % (_special, _value)
_re_option = re.compile(_option) # key=value part of a Content-Type-like header

def header_quote(val):
    if not _re_special.search(val):
        return val
    return '"' + val.replace('\\','\\\\').replace('"','\\"') + '"'

def header_unquote(val, filename=False):
    if val[0] == val[-1] == '"':
        val = val[1:-1]
        if val[1:3] == ':\\' or val[:2] == '\\\\':
            val = val.split('\\')[-1] # fix ie6 bug: full path --> filename
        return val.replace('\\\\','\\').replace('\\"','"')
    return val
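
# Illustration only (not part of the original module): a self-contained
# Python 3 sketch of the quoting rules used by header_quote and
# header_unquote. The names SPECIAL, quote and unquote, and the sample
# values, are hypothetical. Values without special characters pass through
# unchanged; everything else is wrapped in double quotes with backslashes
# and quotes escaped.

```python
import re

# Characters that force a header value to be quoted.
SPECIAL = re.compile('[%s]' % re.escape('()<>@,;:\\"/[]?={} \t'))


def quote(val):
    if not SPECIAL.search(val):
        return val
    return '"' + val.replace('\\', '\\\\').replace('"', '\\"') + '"'


def unquote(val):
    if val[0] == val[-1] == '"':
        val = val[1:-1]
        # A Windows drive path or UNC path is reduced to its last component.
        if val[1:3] == ':\\' or val[:2] == '\\\\':
            val = val.split('\\')[-1]
        return val.replace('\\\\', '\\').replace('\\"', '"')
    return val


plain = quote('utf8')
quoted = quote('my file.txt')
round_trip = unquote(quote('say "hi"'))
basename = unquote('"C:\\docs\\report.txt"')
```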

def parse_options_header(header, options=None):
    if ';' not in header:
        return header.lower().strip(), {}
    ctype, tail = header.split(';', 1)
    options = options or {}
    for match in _re_option.finditer(tail):
        key = match.group(1).lower()
        value = header_unquote(match.group(2), key=='filename')
        options[key] = value
    return ctype, options
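
# Illustration only (not part of the original module): a self-contained
# Python 3 sketch of how a header such as Content-Disposition is split into
# a main value and key=value options, following the regular expressions
# above. The names parse_options, OPTION and unquote, and the sample
# headers, are hypothetical.

```python
import re

SPECIAL = re.escape('()<>@,;:\\"/[]?={} \t')
QSTR = '"(?:\\\\.|[^"])*"'                       # quoted string
VALUE = '(?:[^%s]+|%s)' % (SPECIAL, QSTR)        # bare token or quoted string
OPTION = re.compile(r'(?:;|^)\s*([^%s]+)\s*=\s*(%s)' % (SPECIAL, VALUE))


def unquote(val):
    if val[0] == val[-1] == '"':
        val = val[1:-1]
        # Reduce a Windows drive path or UNC path to its last component.
        if val[1:3] == ':\\' or val[:2] == '\\\\':
            val = val.split('\\')[-1]
        return val.replace('\\\\', '\\').replace('\\"', '"')
    return val


def parse_options(header):
    if ';' not in header:
        return header.lower().strip(), {}
    main, tail = header.split(';', 1)
    options = {}
    for match in OPTION.finditer(tail):
        # Option names are case-insensitive, so normalise them.
        options[match.group(1).lower()] = unquote(match.group(2))
    return main, options


disposition, params = parse_options(
    'form-data; name="upload"; filename="C:\\tmp\\photo.jpg"')
ctype, ctype_params = parse_options('multipart/form-data; boundary=----abc123')
```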
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   138
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   139
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   140
################################## Multipart ##################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   141
##############################################################################
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   142
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   143
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   144
class MultipartError(ValueError): pass
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   145
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   146
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   147
class MultipartParser(object):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   148
    
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   149
    def __init__(self, stream, boundary, content_length=-1,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   150
                 disk_limit=2**30, mem_limit=2**20, memfile_limit=2**18,
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   151
                 buffer_size=2**16, charset='latin1'):
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   152
        ''' Parse a multipart/form-data byte stream. This object is an iterator
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   153
            over the parts of the message.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   154
            
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   155
            :param stream: A file-like stream. Must implement ``.read(size)``.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   156
            :param boundary: The multipart boundary as a byte string.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   157
            :param content_length: The maximum number of bytes to read.
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   158
        '''
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   159
        self.stream, self.boundary = stream, boundary
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   160
        self.content_length = content_length
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   161
        self.disk_limit = disk_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   162
        self.memfile_limit = memfile_limit
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   163
        self.mem_limit = min(mem_limit, self.disk_limit)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   164
        self.buffer_size = min(buffer_size, self.mem_limit)
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   165
        self.charset = charset
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   166
        if self.buffer_size - 6 < len(boundary): # "--boundary--\r\n"
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   167
            raise MultipartError('Boundary does not fit into buffer_size.')
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   168
        self._done = []
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   169
        self._part_iter = None
b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   170
    
    def __iter__(self):
        ''' Iterate over the parts of the multipart message. '''
        if not self._part_iter:
            self._part_iter = self._iterparse()
        for part in self._done:
            yield part
        for part in self._part_iter:
            self._done.append(part)
            yield part

    def parts(self):
        ''' Returns a list with all parts of the multipart message. '''
        return list(iter(self))

    def get(self, name, default=None):
        ''' Return the first part with that name or a default value (None). '''
        for part in self:
            if name == part.name:
                return part
        return default

    def get_all(self, name):
        ''' Return a list of parts with that name. '''
        return [p for p in self if p.name == name]

    def _lineiter(self):
        ''' Iterate over a binary file-like object line by line. Each line is
            returned as a (line, line_ending) tuple. If the line does not fit
            into self.buffer_size, line_ending is empty and the rest of the line
            is returned with the next iteration.
        '''
        read = self.stream.read
        maxread, maxbuf = self.content_length, self.buffer_size
        _bcrnl = tob('\r\n')
        _bcr = _bcrnl[:1]
        _bnl = _bcrnl[1:]
        _bempty = _bcrnl[:0] # b'\r\n'[:0] -> b''
        buffer = _bempty # buffer for the last (partial) line
        while True:
            data = read(maxbuf if maxread < 0 else min(maxbuf, maxread))
            maxread -= len(data)
            lines = (buffer+data).splitlines(True)
            # an empty stream produces no lines at all, so guard lines[0]
            len_first_line = len(lines[0]) if lines else 0
            # be sure that the first line does not become too big
            if len_first_line > self.buffer_size:
                # at the same time don't split a '\r\n' accidentally
                if (len_first_line == self.buffer_size+1 and
                    lines[0].endswith(_bcrnl)):
                    splitpos = self.buffer_size - 1
                else:
                    splitpos = self.buffer_size
                lines[:1] = [lines[0][:splitpos],
                             lines[0][splitpos:]]
            if data:
                # keep the last (possibly partial) line for the next round
                buffer = lines[-1]
                lines = lines[:-1]
            for line in lines:
                if line.endswith(_bcrnl): yield line[:-2], _bcrnl
                elif line.endswith(_bnl): yield line[:-1], _bnl
                elif line.endswith(_bcr): yield line[:-1], _bcr
                else:                     yield line, _bempty
            if not data:
                break

    def _iterparse(self):
        ''' Parse the stream and yield each MultipartPart as it completes. '''
        lines, line = self._lineiter(), ''
        separator = tob('--') + tob(self.boundary)
        terminator = tob('--') + tob(self.boundary) + tob('--')
        # Consume first boundary. Ignore leading blank lines
        for line, nl in lines:
            if line: break
        if line != separator:
            raise MultipartError("Stream does not start with boundary")
        # For each part in stream...
        mem_used, disk_used = 0, 0 # Track used resources to prevent DoS
        is_tail = False # True if the previous line was incomplete (cut off)
        opts = {'buffer_size': self.buffer_size,
                'memfile_limit': self.memfile_limit,
                'charset': self.charset}
        part = MultipartPart(**opts)
        for line, nl in lines:
            if line == terminator and not is_tail:
                part.file.seek(0)
                yield part
                break
            elif line == separator and not is_tail:
                if part.is_buffered(): mem_used  += part.size
                else:                  disk_used += part.size
                part.file.seek(0)
                yield part
                part = MultipartPart(**opts)
            else:
                is_tail = not nl # The next line continues this one
                part.feed(line, nl)
                if part.is_buffered():
                    if part.size + mem_used > self.mem_limit:
                        raise MultipartError("Memory limit reached.")
                elif part.size + disk_used > self.disk_limit:
                    raise MultipartError("Disk limit reached.")
        if line != terminator:
            raise MultipartError("Unexpected end of multipart stream.")


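# The chunked line splitting described in the _lineiter docstring can be tried
# in isolation. The block below is a minimal sketch, not this module's code:
# `iter_lines` is a hypothetical standalone function, it assumes Python 3
# bytes objects, and it reads until EOF instead of honouring a content length.

```python
from io import BytesIO

def iter_lines(stream, buffer_size=16):
    """Yield (line, line_ending) tuples from a binary stream (illustrative only)."""
    buf = b''
    while True:
        data = stream.read(buffer_size)
        lines = (buf + data).splitlines(True)
        # Cap the first line at buffer_size so the buffer cannot grow unbounded.
        if lines and len(lines[0]) > buffer_size:
            first = lines[0]
            if len(first) == buffer_size + 1 and first.endswith(b'\r\n'):
                cut = buffer_size - 1   # do not cut between '\r' and '\n'
            else:
                cut = buffer_size
            lines[:1] = [first[:cut], first[cut:]]
        if data:
            # The last element may be incomplete: hold it back for the next read.
            buf = lines.pop()
        for line in lines:
            for ending in (b'\r\n', b'\n', b'\r'):
                if line.endswith(ending):
                    yield line[:-len(ending)], ending
                    break
            else:
                yield line, b''         # over-long line, continued next time
        if not data:
            break

pairs = list(iter_lines(BytesIO(b'ab\r\ncd\nef')))
# pairs == [(b'ab', b'\r\n'), (b'cd', b'\n'), (b'ef', b'')]
```

# An empty line ending marks a piece that was cut at buffer_size (or the final
# unterminated line), which is what lets a caller such as _iterparse tell a
# real boundary line from the tail of a longer line.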
class MultipartPart(object):

    def __init__(self, buffer_size=2**16, memfile_limit=2**18, charset='latin1'):
        self.headerlist = []
        self.headers = None
        self.file = False
        self.size = 0
        self._buf = tob('')
        self.disposition, self.name, self.filename = None, None, None
        self.content_type, self.charset = None, charset
        self.memfile_limit = memfile_limit
        self.buffer_size = buffer_size

    def feed(self, line, nl=''):
        if self.file:
            return self.write_body(line, nl)
        return self.write_header(line, nl)

    def write_header(self, line, nl):
        line = line.decode(self.charset or 'latin1')
        if not nl: raise MultipartError('Unexpected end of line in header.')
        if not line.strip(): # blank line -> end of header segment
            self.finish_header()
        elif line[0] in ' \t' and self.headerlist:
            name, value = self.headerlist.pop()
            self.headerlist.append((name, value+line.strip()))
        else:
            if ':' not in line:
                raise MultipartError("Syntax error in header: No colon.")
            name, value = line.split(':', 1)
            self.headerlist.append((name.strip(), value.strip()))

    def write_body(self, line, nl):
        if not line and not nl: return # This does not even flush the buffer
        self.size += len(line) + len(self._buf)
        self.file.write(self._buf + line)
        self._buf = nl
        if self.content_length > 0 and self.size > self.content_length:
            raise MultipartError('Size of body exceeds Content-Length header.')
        if self.size > self.memfile_limit and isinstance(self.file, BytesIO):
            # TODO: What about non-file uploads that exceed the memfile_limit?
            self.file, old = TemporaryFile(mode='w+b'), self.file
            old.seek(0)
            copy_file(old, self.file, self.size, self.buffer_size)

    def finish_header(self):
        self.file = BytesIO()
        self.headers = Headers(self.headerlist)
        cdis = self.headers.get('Content-Disposition','')
        ctype = self.headers.get('Content-Type','')
        clen = self.headers.get('Content-Length','-1')
        if not cdis:
            raise MultipartError('Content-Disposition header is missing.')
        self.disposition, self.options = parse_options_header(cdis)
        self.name = self.options.get('name')
        self.filename = self.options.get('filename')
        self.content_type, options = parse_options_header(ctype)
        self.charset = options.get('charset') or self.charset
        # reuse the value looked up above instead of querying the header twice
        self.content_length = int(clen)

    def is_buffered(self):
        ''' Return true if the data is fully buffered in memory.'''
        return isinstance(self.file, BytesIO)

    @property
    def value(self):
        ''' Data decoded with the specified charset '''
        pos = self.file.tell()
        self.file.seek(0)
        val = self.file.read()
        self.file.seek(pos)
        return val.decode(self.charset)

b71158815bc8 [wsgi] avoid reading the entire request body in memory
Julien Cristau <julien.cristau@logilab.fr>
parents:
diff changeset
   347
    def save_as(self, path):
        pos = self.file.tell()
        # The with-block closes the target file even if copying fails.
        with open(path, 'wb') as fp:
            try:
                self.file.seek(0)
                size = copy_file(self.file, fp)
            finally:
                self.file.seek(pos)
        return size

##############################################################################
#################################### WSGI ####################################
##############################################################################

def parse_form_data(environ, charset='utf8', strict=False, **kw):
    ''' Parse form data from an environ dict and return a (forms, files) tuple.
        Both tuple values are dictionaries with the form-field name as a key
        (unicode) and lists as values (multiple values per key are possible).
        The forms-dictionary contains form-field values as unicode strings.
        The files-dictionary contains :class:`MultipartPart` instances, either
        because the form-field was a file-upload or the value is too big to
        fit into memory limits.

        :param environ: A WSGI environment dict.
        :param charset: The charset to use if unsure. (default: utf8)
        :param strict: If True, raise :exc:`MultipartError` on any parsing
                       errors. These are silently ignored by default.
        :param kw: Extra keyword arguments are passed on to
                   :class:`MultipartParser`. ``mem_limit`` also caps the
                   size of url-encoded bodies (default: 2**20 bytes).
    '''
    forms, files = MultiDict(), MultiDict()
    try:
        if environ.get('REQUEST_METHOD', 'GET').upper() not in ('POST', 'PUT'):
            raise MultipartError("Request method other than POST or PUT.")
        # CONTENT_LENGTH may be absent or empty in a WSGI environ.
        content_length = int(environ.get('CONTENT_LENGTH') or '-1')
        content_type = environ.get('CONTENT_TYPE', '')
        if not content_type:
            raise MultipartError("Missing Content-Type header.")
        content_type, options = parse_options_header(content_type)
        stream = environ.get('wsgi.input') or BytesIO()
        kw['charset'] = charset = options.get('charset', charset)
        if content_type == 'multipart/form-data':
            boundary = options.get('boundary', '')
            if not boundary:
                raise MultipartError("No boundary for multipart/form-data.")
            for part in MultipartParser(stream, boundary, content_length, **kw):
                if part.filename or not part.is_buffered():
                    files[part.name] = part
                else:  # TODO: Big form-fields are in the files dict. really?
                    forms[part.name] = part.value
        elif content_type in ('application/x-www-form-urlencoded',
                              'application/x-url-encoded'):
            mem_limit = kw.get('mem_limit', 2**20)
            if content_length > mem_limit:
                raise MultipartError("Request too big. Increase mem_limit.")
            data = stream.read(mem_limit).decode(charset)
            if stream.read(1):  # There is more data than fits in mem_limit
                raise MultipartError("Request too big. Increase mem_limit.")
            data = parse_qs(data, keep_blank_values=True)
            for key, values in data.items():
                for value in values:
                    forms[key] = value
        else:
            raise MultipartError("Unsupported content type.")
    except MultipartError:
        if strict:
            raise
    return forms, files
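
# A minimal, self-contained sketch of the bounded-read check used by the
# url-encoded branch of parse_form_data() above. It uses only the standard
# library (Python 3 names). `read_urlencoded` and `BodyTooBig` are
# hypothetical names chosen for illustration; they are not part of this
# module, and the real function reports the error as MultipartError.

```python
from io import BytesIO
from urllib.parse import parse_qs


class BodyTooBig(ValueError):
    """Hypothetical error type standing in for MultipartError."""


def read_urlencoded(stream, content_length=-1, mem_limit=2**20, charset='utf8'):
    """Parse a url-encoded body, reading at most mem_limit bytes."""
    # Reject early when the declared length already exceeds the limit.
    if content_length > mem_limit:
        raise BodyTooBig("declared length exceeds mem_limit")
    data = stream.read(mem_limit)
    # The declared length may be missing or wrong, so probe for one more byte.
    if stream.read(1):
        raise BodyTooBig("body exceeds mem_limit")
    # keep_blank_values=True keeps fields that were submitted empty.
    return parse_qs(data.decode(charset), keep_blank_values=True)


fields = read_urlencoded(BytesIO(b'name=alice&tag=a&tag=b&empty='))
```

# Each key maps to a list of values, which is why parse_form_data() adds
# them to the forms dictionary one value at a time.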