Hi,

This is a post that I believe can be helpful to fellow Quants here.

Many of you certainly use Lists in their Python algos. It's all fine. However, while working on a new data structure for my own Python algos, I realized something that I wanted to share:

When you slice Lists only to read the result (the slice), it is MUCH MORE efficient to use itertools.islice() instead of usual Python slicing, such as my_list[start:stop:step]. This is because slicing in Python creates a new List with the sliced elements in it. Indeed, a new List must be created (which takes longer than most people think) and the elements of the slice must be copied to the newly created List, which is then returned (this is only the references to the objects - so a shallow copy, not a deep copy - which is not too long). However, with itertools.islice(), an iterator over the slice is returned, and thus iteration is done on the original list, not a copy. This is very very fast (even if it is in O(n) - islice() only has to skip and count elements until it reaches your start index). Of course, as stated at the beginning of this paragraph, if you create the slice on purpose, because you need to work on a copy, for instance, if you want to alter the slice without altering the original List, then, slicing is the correct option. But if you slice only to read the resulting data in the slice, use itertools.islice() instead.

To give some examples, I am talking about doing this:

from itertools import islice
#Here, s is an iterator over the specified fragment of the original list 'l'
s = islice(l, start, stop, step)
for i in s:
    #Do something
    #You cannot write to s, doing something like:
    #s[index] = ...
    #because s is an iterator, not a List.

Instead of this:

s = l[start:stop:step]
for i in s:
    #Do something.
    #But in your algo, you are not writing to s / changing its contents,
    #like with s[index] = ...
    #You are only slicing to read the slice.


Here are some profiling results for convincing the doubtful:

Wrote profile results to test15.py.lprof
Timer unit: 1e-06 s

Total time: 84.6607 s
File: test15.py
Function: f1 at line 7

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     7                                           @profile
     8                                           def f1():
     9   1000001     506510.9      0.5      0.6      for _ in range(1_000_000):
    10   1000000    5016767.1      5.0      5.9          i = random.randint(0, 49_999)
    11   1000000    4095157.0      4.1      4.8          j = random.randint(0, 49_999)
    12                                                   #Here, s is a List, a copy of the specified fragment of the original list
    13   1000000   75042277.5     75.0     88.6          s = l[min(i, j):max(i, j)]

Total time: 10.2587 s
File: test15.py
Function: f2 at line 15

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    15                                           @profile
    16                                           def f2():
    17   1000001     381841.1      0.4      3.7      for _ in range(1_000_000):
    18   1000000    4451683.4      4.5     43.4          i = random.randint(0, 49_999)
    19   1000000    4424653.7      4.4     43.1          j = random.randint(0, 49_999)
    20                                                   #Here, s is an iterator over the specified fragment of the original list
    21   1000000    1000567.7      1.0      9.8          s = islice(l, min(i, j), max(i, j))

1 microsecond per islice() call vs 75 microseconds per slice.

Please note:

1- itertools.islice() does not handle negative indexes. Here is a replacement that I coded that handles them:

def islice_neg(iterable, *args):
	start, stop, step = slice(*args).indices(len(iterable))
	return itertools.islice(iterable, start, stop, step)

You can include it in your algos. You use it just like islice() if you want to include negative indexes (starting from the end of the structure).

2- This does not only apply to Lists but any other Python data structures that you slice only to read the result… islice them, or islice_neg them, instead.

Fred