
Loop vs Map vs List Comprehension

As a Python developer I have seen lots of different preferences on this topic. Preferences aside, I set out to find which approach is faster in which situations.

TL;DR

If you require a list of results, almost always use a list comprehension. If no results are required, a simple loop is simpler to read and faster to run. Never use the builtin map unless it is more aesthetically appealing for that piece of code and your application does not need the speed improvement.

No Result Required

The first pattern I have seen is the use of map or a list comprehension instead of a standard loop in cases where no result is required, so the resulting list (returned by Python's map function or the list comprehension) is simply discarded. For example, the following

for entry in entries:
    process(entry)

vs

map(process, entries)
[process(entry) for entry in entries]

Three variations of the test were run: the first is a single loop (O(n)), the second is a loop within a loop (O(n^2)), and the third is three nested loops (O(n^3)). The test code can be seen on GitHub.
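
For reference, a comparison along these lines can be reproduced with the standard timeit module; a minimal sketch, where process and entries are stand-ins for whatever work and data you are measuring:

import timeit

setup = '''
entries = range(10000)
def process(entry):
    return entry * 2
'''

# time each no-result variant over 100 runs
print timeit.timeit("for entry in entries:\n    process(entry)", setup, number=100)
print timeit.timeit("map(process, entries)", setup, number=100)
print timeit.timeit("[process(entry) for entry in entries]", setup, number=100)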

Result Required

This is the traditional pattern, where the generated list is actually used. Again we compare how fast each approach is in Python. For example

results = []
for entry in entries:
    results.append(process(entry))

vs

results = map(process, entries)
results = [process(entry) for entry in entries]

As in the previous test, the same three variations were run: O(n), O(n^2), and O(n^3).
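
The result-producing variants can be timed the same way, again as a rough sketch with the same stand-in process and entries:

import timeit

setup = '''
entries = range(10000)
def process(entry):
    return entry * 2
'''

loop = '''
results = []
for entry in entries:
    results.append(process(entry))
'''

print timeit.timeit(loop, setup, number=100)
print timeit.timeit("results = map(process, entries)", setup, number=100)
print timeit.timeit("results = [process(entry) for entry in entries]", setup, number=100)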

Conclusions

If you require a list of results, almost always use a list comprehension. If no results are required, a simple loop is simpler to read and faster to run. Never use the builtin map unless it is more aesthetically appealing for that piece of code and your application does not need the speed improvement.

7zip vs Gzip … compression and speed

Since we work with Twitter, this comparison uses Twitter as the data source. We randomly selected 5k users and used their public tweets for this test.

Setup

The setup is not extremely important since we are more concerned with the ratios for our conclusions. For the sake of completeness, the tests were run on Python 2.7.3 (gcc 4.7.2) with the latest version of pylzma from PyPI (0.4.4) and gzip from the standard Python library. The hardware is a CPU E5-2620 @ 2.00GHz, and the software was allowed to use its full capacity.
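
As a rough sketch of how each user's tweet blob was compressed and timed, assuming pylzma's module-level compress() function and the standard zlib module (the file name is illustrative):

import time
import zlib
import pylzma  # 0.4.4 from PyPI

def compress_and_time(data, compressor):
    # return (compressed size in bytes, seconds taken) for one user's tweets
    start = time.time()
    compressed = compressor(data)
    return len(compressed), time.time() - start

with open('user_tweets.json', 'rb') as f:  # illustrative file name
    raw = f.read()

lzma_size, lzma_secs = compress_and_time(raw, pylzma.compress)
zlib_size, zlib_secs = compress_and_time(raw, lambda d: zlib.compress(d, 9))

print 'lzma: %d bytes in %.3fs' % (lzma_size, lzma_secs)
print 'zlib: %d bytes in %.3fs' % (zlib_size, zlib_secs)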

Data

As mentioned, a random sample of 5k users was selected for this experiment. The following are some stats about the data.

Users:                      5,000
Avg # of Tweets:            1433
Std Dev:                    1018
Avg uncompressed size:      1228518 bytes (1.2Mb)
Std Dev size:               874973 bytes (0.8 Mb)

Compression

There is no doubt that lzma compresses to smaller sizes than zlib; in these tests we want to see exactly how much better the compression ratio actually is.

lzma avg size: 100286 (8.2%, 97kb)
zlib avg size: 142456 (11.6%, 139kb)

Speed

lzma compresses both smaller and faster than bz2, but it is well known that zlib compresses faster still. Here is a comparison of the compression speed difference on our dataset.

lzma  3884.27s (776.8ms / user)
zlib   184.40s (36.9ms / user)

Note: the lzma test used about 2x as much memory as the zlib test

Conclusion

With our dataset, lzma compressed down to an average of 8% of the original size, while zlib compressed to 12%. In measurable numbers, for 5,000 users' tweets, using lzma would save about 200 MB, an average saving of 41 KB per user. Regarding compression speed, using 7zip (lzma) we spent about 1 hour more in total, an average of 0.7s more per user.

To put things into perspective, if we are processing 1 million users, gzip would finish compressing roughly 9 days sooner (0.7s more per user × 1,000,000 users ≈ 8.6 days), but its output would carry an extra overhead of about 40 GB (41 KB × 1,000,000 users).

Neither the compression speed nor the size is really negligible in this case, so depending on your specific needs you may pick one over the other. Generally, though, for many people and use cases disk space is not as much of a concern as speed.

Python Memory Footprint

This article applies to Python 2.7 64-bit (32-bit and Python 3 may differ)

Edit: I added simpler estimate formulas alongside the actual formulas, so the Python memory footprint can be quickly visualized.

Edit 2: 32-bit Python seems to use around half the memory, which appears to be due to 32-bit pointers instead of 64-bit ones. That being said, you are then limited to 2 GB of memory.

Some developers are unaware of the memory footprint Python has and tend to hit walls, especially if they try to load big data into memory instead of using efficient cache-oblivious algorithms and data structures.

This post demonstrates the memory footprint of basic Python objects and data structures. You can use this data to estimate how much memory you need to support your program, or to lay out your program better if memory starts to run out. This data was collected using the Python heap profiler Guppy-PE.

  1. Boolean and Numerical Types
  2. Strings
  3. Data Structures (Lists, Tuples, Dicts)
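
All of the measurements below were taken with Guppy's hp.iso(); a small helper along these lines reproduces them (the .size attribute on the returned identity set is an assumption about Guppy's API, though it corresponds to the "Total size" shown in the Out lines below):

import guppy

hp = guppy.hpy()

def footprint(obj):
    # size of the object itself, not of anything it references
    return hp.iso(obj).size

print footprint(True)        # 24 bytes, per the measurement below
print footprint('a' * 20)    # 64 bytes
print footprint(range(16))   # 264 bytes (range() returns a list in Python 2)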

Boolean and Numerical Types

  • Boolean (bool): 24 bytes
    import guppy
    hp = guppy.hpy()
    
    In : hp.iso(True)
    Out: Partition of a set of 1 object. Total size = 24 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       24 100        24 100 bool
    
  • Integers (int): 24 bytes
    In : hp.iso(1)
    Out: Partition of a set of 1 object. Total size = 24 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100       24 100        24 100 int
    
    In : hp.iso(2**62)
    Out: Partition of a set of 1 object. Total size = 24 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       24 100        24 100 int
  • Long Integers (long): 32 + int(math.log(NUMBER, 2) / 60) * 8 bytes (restated in code after this list)
    In : hp.iso(long(0))
    Out: Partition of a set of 1 object. Total size = 24 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       24 100        24 100 long
    
    In : hp.iso(long(2**60))
    Out: Partition of a set of 1 object. Total size = 40 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       40 100        40 100 long
    
    In : hp.iso(2**120)
    Out: Partition of a set of 1 object. Total size = 48 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       48 100        48 100 long
    
    In : hp.iso(2**180)
    Out: Partition of a set of 1 object. Total size = 56 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       56 100        56 100 long
    
    # 32 + int(math.log(abs(NUMBER) or 1, 2) / 60) * 8
  • Float: 24 bytes
    In : hp.iso(1.0)
    Out: Partition of a set of 1 object. Total size = 24 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100       24 100        24 100 float
    
    In : hp.iso(128301289308129083901231.09102783098192083091823089120839012)
    Out: Partition of a set of 1 object. Total size = 24 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100       24 100        24 100 float
  • Decimal (decimal.Decimal): 80 bytes
    In : hp.iso(decimal.Decimal('1.0'))
    Out: Partition of a set of 1 object. Total size = 80 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100       80 100        80 100 decimal.Decimal
    
    In : hp.iso(decimal.Decimal('128301289308129083901231.09102783098192083091823089120839012'))
    Out: Partition of a set of 1 object. Total size = 80 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100       80 100        80 100 decimal.Decimal
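
The long formula above can be restated as a small helper. This is only the post's approximation, not an exact accounting, and math.log on large values is subject to floating point rounding:

import math

def long_footprint_estimate(n):
    # rule of thumb from above: 32 bytes plus 8 bytes per additional 60 bits
    bits = math.log(abs(n) or 1, 2)
    return 32 + int(bits / 60) * 8

# compare against the guppy measurements above (e.g. 2**120 measured at 48 bytes)
print long_footprint_estimate(2 ** 120)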

Strings

Note: In Python 2, string concatenation using __add__ (essentially '+') creates intermediate strings, which will grab much more memory than you need. The efficient way to join strings is to use the string join method or '%s' string formatting, for example. Just avoid building up strings with '+' (at least until you move to Python 3).
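
A quick illustration of the note above; the '+' loop allocates a fresh intermediate string on every iteration, while join builds the result in a single pass (parts here is just stand-in data):

parts = ['tweet-%d' % i for i in range(10000)]

slow = ''
for p in parts:
    slow = slow + p      # a new intermediate string is allocated each time

fast = ''.join(parts)    # single pass, preferred

assert slow == fast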

Each additional 8 characters use 8 bytes, on top of an initial 40 bytes (which covers up to 3 characters).

  • String: 40 + ((len(s) - 4) / 8 + 1) * 8 bytes ~= 40 + len(s) (restated in code after this list)
    In : hp.iso('a'*3)
    Out: Partition of a set of 1 object. Total size = 40 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       40 100        40 100 str
    
    In : hp.iso('a'*4)
    Out: Partition of a set of 1 object. Total size = 48 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       48 100        48 100 str
    
    In : hp.iso('a'*12)
    Out: Partition of a set of 1 object. Total size = 56 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       56 100        56 100 str
    
    In : hp.iso('a'*20)
    Out: Partition of a set of 1 object. Total size = 64 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100       64 100        64 100 str
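
The string estimate can likewise be written as a helper; again this is just the post's approximation, and the division is Python 2 integer division:

def str_footprint_estimate(s):
    # 40 bytes covers up to 3 characters, then each 8-character block adds 8 bytes
    if len(s) <= 3:
        return 40
    return 40 + ((len(s) - 4) / 8 + 1) * 8

print str_footprint_estimate('a' * 12)  # 56, matching the measurement above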
    

Data Structures (Lists, Tuples, Dicts)

The following is just each structure's own memory usage, not that of the objects inside it:

  • Tuple: 56 + 8 * len(t) bytes (this and the list formula below are restated in code after this list)
    In : hp.iso(tuple())
    Out: Partition of a set of 1 object. Total size = 56 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100       56 100        56 100 tuple
    
    In : hp.iso(tuple(range(1)))
    Out: Partition of a set of 1 object. Total size = 64 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100       64 100        64 100 tuple
    
    In : hp.iso(tuple(range(2)))
    Out: Partition of a set of 1 object. Total size = 72 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100       72 100        72 100 tuple
    
    In : hp.iso(tuple(range(100)))
    Out: Partition of a set of 1 object. Total size = 856 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100      856 100       856 100 tuple
    
  • List: 72 + 64 * int(1 + (len(l) + 1) / 8) bytes ~= 72 + len(l) * 8
    In : hp.iso(list())
    Out: Partition of a set of 1 object. Total size = 72 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100       72 100        72 100 list
    
    In : hp.iso(list(range(1)))
    Out: Partition of a set of 1 object. Total size = 136 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100      136 100       136 100 list
    
    In : hp.iso(list(range(8)))
    Out: Partition of a set of 1 object. Total size = 200 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100      200 100       200 100 list
    
    In : hp.iso(list(range(16)))
    Out: Partition of a set of 1 object. Total size = 264 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100      264 100       264 100 list
    
  • Dictionary (dict): memory depends on the number of hash buckets allocated; here is the pattern that seems to be exhibited. The first 5 elements are included in the initial 280 bytes. The next bucket holds up to 16 (2**4) more elements at roughly 52.5 bytes per element. The next holds 64 (2**6) more elements at 36 bytes per element. The next holds 256 (2**8) more elements at 36 bytes per element. The next holds 1024 (2**10) more elements at 36 bytes per element, and so on. I have not tried to come up with a formula for this one; feel free to solve it in the comments.
    In : hp.iso(dict())
    In : hp.iso(dict([(x,None) for x in range(5)]))
    Out: Partition of a set of 1 object. Total size = 280 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100      280 100       280 100 dict (no owner)
    
    In : hp.iso(dict([(x,None) for x in range(6)]))
    In : hp.iso(dict([(x,None) for x in range(5 + 16)]))
    Out: Partition of a set of 1 object. Total size = 1048 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100     1048 100      1048 100 dict (no owner)
    
    In : hp.iso(dict([(x,None) for x in range(6 + 16)]))
    In : hp.iso(dict([(x,None) for x in range(5 + 16 + 64)]))
    Out: Partition of a set of 1 object. Total size = 3352 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100     3352 100      3352 100 dict (no owner)
    
    In : hp.iso(dict([(x,None) for x in range(6 + 16 + 64)]))
    In : hp.iso(dict([(x,None) for x in range(5 + 16 + 64 + 128)]))
    Out: Partition of a set of 1 object. Total size = 12568 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
    0      1 100    12568 100     12568 100 dict (no owner)
    
    In : hp.iso(dict([(x,None) for x in range(6 + 16 + 64 + 128)]))
    In : hp.iso(dict([(x,None) for x in range(5 + 16 + 64 + 128 + 1024)]))
    Out: Partition of a set of 1 object. Total size = 49432 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1 100    49432 100     49432 100 dict (no owner)
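
The tuple and list estimates can be wrapped up the same way. These simply restate the formulas above (using Python 2 integer division); as noted, the dict case is left without a formula:

def tuple_footprint_estimate(t):
    # 56-byte header plus an 8-byte slot per element
    return 56 + 8 * len(t)

def list_footprint_estimate(l):
    # 72-byte header plus 64-byte blocks of 8 slots; matches the
    # list(range(n)) measurements above for non-empty lists
    return 72 + 64 * int(1 + (len(l) + 1) / 8)

print tuple_footprint_estimate(tuple(range(100)))  # 856
print list_footprint_estimate(range(16))           # 264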