There’s a famous (well famous in Artificial Intelligence circles, anyways) Zen koan that goes like this:
In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6.
“What are you doing?”, asked Minsky.
“I am training a randomly wired neural net to play Tic-Tac-Toe,” Sussman replied.
“Why is the net wired randomly?”, asked Minsky.
“I do not want it to have any preconceptions of how to play”, Sussman said.
Minsky shut his eyes.
“Why do you close your eyes?”, Sussman asked his teacher.
“So that the room will be empty.”
At that moment, Sussman was enlightened.
Minsky was showing Sussman that the randomly-wired neural net (a complex model if ever there was one) actually did have preconceptions; it’s just that we don’t understand what these preconceptions are. So it is with large data sets. There is plenty to learn from them, both about the domains that the data come from, and about the methodology for learning from data.