One problem with the Big Data paradigm is the shortage of software engineers who can work with Big Data. Why is this? Let's check off the reasons:
- It is hard to reason about parallel computation. Steve Jobs famously said that "nobody knows how to program [multi-core]. I mean two, yeah; four, not really; eight, forget it." Now take those eight cores and multiply them across tens, hundreds, or in some cases thousands of machines. This is a difficult problem.
- The ecosystem of tools, languages, and workflows changes constantly. This leaves newcomers confused about which ones to learn and use, and which are irrelevant.