Interesting idea. Columnar ETL can be quite efficient in some scenarios because an ETL transformation (e.g. calculating a new column) frequently just modifies an existing table rather than creating a new one. This allows calculating only the delta instead of rebuilding a new table from scratch, which helps performance and lets calculations run in memory without slow disk I/O.
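To make the "only the delta" point concrete, here is a minimal sketch in Python. It assumes a toy column-oriented table stored as a dict of lists; the names `table` and `add_column` are hypothetical and not taken from any particular engine. Adding a derived column reads just its input columns and writes one new column, leaving everything else untouched.

```python
# Toy column-oriented table (an assumption for illustration):
# each column is stored as its own list.
table = {
    "price": [10.0, 20.0, 30.0],
    "qty": [1, 2, 3],
}


def add_column(table, name, inputs, fn):
    """Add a derived column, reading only the listed input columns.

    Existing columns are neither copied nor rewritten, so the new
    column is the only data produced by the transformation.
    """
    cols = [table[c] for c in inputs]
    table[name] = [fn(*row) for row in zip(*cols)]
    return table


price_before = table["price"]
add_column(table, "total", ["price", "qty"], lambda p, q: p * q)

print(table["total"])                   # the newly computed column
print(table["price"] is price_before)   # existing column object is reused
```

In a row-oriented layout the same transformation would typically mean rewriting every row to make room for the new field.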
Another advantage is that many transformations (e.g. filtering) can be performed directly on dictionary-compressed data, without decompressing it. This works well in Vertica [1] (based on the C-Store DB [2]), which was our inspiration for building a lightweight ETL tool for business users that also uses a columnar in-memory data transformation engine [3].
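As a rough illustration of filtering on dictionary-compressed data, here is a small Python sketch. It is not how Vertica or any other engine implements it; `dict_encode` and `filter_encoded` are hypothetical helpers. The idea is that the predicate is evaluated once per distinct value in the dictionary, and the rows are then selected by comparing integer codes, so the column is never decoded back to its original values.

```python
def dict_encode(values):
    """Dictionary-encode a column: return (dictionary, codes).

    dictionary[i] is the i-th distinct value in order of first appearance;
    codes holds one integer code per row.
    """
    dictionary = []
    index = {}
    codes = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


def filter_encoded(dictionary, codes, predicate):
    """Return row numbers whose value satisfies the predicate.

    The predicate runs once per distinct value; the per-row work is
    only an integer set lookup on the codes.
    """
    matching = {code for code, v in enumerate(dictionary) if predicate(v)}
    return [row for row, code in enumerate(codes) if code in matching]


dictionary, codes = dict_encode(["US", "DE", "US", "FR", "DE", "US"])
rows = filter_encoded(dictionary, codes, lambda country: country == "US")
print(dictionary, codes, rows)
```

With few distinct values and many rows, most of the work becomes cheap integer comparisons instead of string comparisons on every row.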
[1] https://www.vertica.com/
[2] http://db.csail.mit.edu/projects/cstore/
[3] http://easymorph.com/in-memory-engine.html