A data lake, enabled by the open-source software Hadoop, is simply a collection of information in its raw format. Rather than process the data and file it away in a rigid way, GE and Pivotal are storing it in its original form and sifting through it when needed.

I’m not yet sure we need another bit of jargon, but the principle’s a good one. You often don’t know what you need until you look for it and find it’s gone.