Hi,
I am not clear on how to calculate data sizing. The documentation
(http://www.datastax.com/docs/1.2/cluster_architecture/cluster_planning) states:
"On average, raw data is about two times larger on disk after it is loaded into the database, but could be much smaller or larger depending on the characteristics of your data and tables."
But what about compression? It does not seem to be taken into account at all. This other link:
http://www.datastax.com/docs/1.2/operations/compaction_compression#compaction-compression
states "2-4x reduction in size" with compression on.
My question is: does the cluster planning documentation (first link) take compression (second link) into account by default or not?
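To make the question concrete, here is a rough sketch of the two readings. The 100 GB raw size and the `on_disk_estimate` helper are made up for illustration, and dividing the "two times larger" figure by the compression ratio is only my assumption about how the two numbers might combine, not something the docs state:

```python
# Rough sizing sketch. Assumptions (mine, not from the documentation):
# - 100 GB of raw data, chosen only for illustration
# - the "2-4x reduction" applies on top of the "about two times larger" figure

def on_disk_estimate(raw_gb, overhead_factor=2.0, compression_ratio=1.0):
    """Estimated on-disk size in GB for a single copy of the data."""
    return raw_gb * overhead_factor / compression_ratio

raw_gb = 100.0

# Reading 1: the 2x figure already includes compression (or compression is off).
no_extra_compression = on_disk_estimate(raw_gb)

# Reading 2: the 2x figure is uncompressed, and compression then gives 2-4x.
compressed_low = on_disk_estimate(raw_gb, compression_ratio=4.0)
compressed_high = on_disk_estimate(raw_gb, compression_ratio=2.0)

print(f"Reading 1: {no_extra_compression:.0f} GB")
print(f"Reading 2: {compressed_low:.0f} to {compressed_high:.0f} GB")
```

Depending on which reading is right, the estimate for the same raw data differs by a factor of 2 to 4, which is why I would like to know which one the cluster planning page intends.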
The documentation probably needs to be updated.
Thanks,
Matt