Is there an option I missed, or any plan to implement persistent storage for users' own preprocessed or generated data in the research environment? (It would be great if it could be accessed from algos too.)

In the research phase, a common pattern is to preprocess data first (e.g. generate custom higher-frequency data from minute data, or compute some ML features) and then work with the result many times.

 

This preprocessing often takes a long time (tens of minutes, even hours), so in a local environment it's a good idea to serialize the results (e.g. to CSV, Parquet, or a database) and reuse them.
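To make the local pattern concrete, here is a minimal sketch of the "compute once, reload afterwards" workflow I mean. It uses only the standard library (`pickle`) so it runs anywhere; the names `load_or_compute`, `expensive_preprocessing` and `features.pkl` are made up for illustration, and the tiny 5-minute aggregation is just a stand-in for a genuinely slow step.

```python
import os
import pickle
import tempfile


def load_or_compute(cache_path, compute):
    """Return the cached result if it exists, otherwise compute and store it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    result = compute()

    # Write to a temporary file and rename it afterwards, so an interrupted
    # run does not leave a half-written cache file behind.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(result, f)
    os.replace(tmp_path, cache_path)
    return result


calls = {"count": 0}  # counts how often the slow step actually runs


def expensive_preprocessing():
    # Hypothetical stand-in for a slow step: average minute closes
    # into 5-minute values.
    calls["count"] += 1
    minute_closes = [100.0 + i * 0.1 for i in range(30)]
    return [sum(minute_closes[i:i + 5]) / 5 for i in range(0, 30, 5)]


cache_dir = tempfile.mkdtemp()
cache_file = os.path.join(cache_dir, "features.pkl")

first = load_or_compute(cache_file, expensive_preprocessing)   # computes and saves
second = load_or_compute(cache_file, expensive_preprocessing)  # loads from disk
```

With pandas the same idea would typically use `DataFrame.to_parquet` and `pandas.read_parquet` (or `to_csv` / `read_csv`) instead of `pickle`. The point is only that the second call skips the slow step entirely.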

I didn't see anything like this in Quantopian. 

It's quite a waste of (computational and human) resources to generate the same data many times, so I believe it would be of great use to many users.

Something like an encapsulated, S3-like storage for pandas DataFrames would be ideal (and cheap enough for a reasonable amount of data, relative to the price of computational resources).
