The German poet and essayist Friedrich Schiller wrote in his poem "Sentences of Confucius" [my translation, based on a long-past "German for Scientists" course with some help from Google Translate]:
The measure of space is threefold:
restless, unceasing, free
Length strives, it shoots afar;
Breadth pours across the endless sea;
Bottomless depth descends past stars.
They are a picture of your goal:
Strive forward with your being whole,
Do not stand still, and do not tire,
To see at last creation's fire;
You have to open up your mind,
Re-form yourself in this world's shape;
You must go deep to open your eyes
Only persistence will lead to the prize
Only long study achieves what persists,
For truth dwells in the abyss.
Schiller wrote in the late 1700s, and while he isn't entirely wrong in his advice to work as hard as humanly possible if we want to get anywhere with understanding reality, the modern version would end more like this:
Only long study achieves contributions
For knowledge lies in distributions.
Is it better poetry? Eh. But it's certainly a better way of creating knowledge.
Individual facts are like puzzle pieces, and if we only have a few of them it's very difficult to get any idea of what the picture actually might be.
If we have lots and lots, though, we can start making inferences, and we can do it without assembling the puzzle. This is important, because when it comes to reality, we can't ever simply "look at" the puzzle, especially when it comes to the abstract causes that dominate the world, like force and chance and selection effects.
We may never see the "thing in itself", as Kant put it, but we can make strong, well-justified inferences about reality based on the puzzle pieces we have. If a puzzle contains a lot of blue, for example, it's probably (not certainly) an outdoor scene with lots of sky or water. A lot of green suggests a forest or garden scene. With a careful study of puzzles of different kinds we could learn how to draw conclusions from different distributions of colours among the pieces.
Philosophers might claim that there were "crucial pieces" that "proved" a puzzle was one thing or another, but in time people who actually look at reality rather than their imagination would find exceptions to those rules.
Puzzle pieces are facts in this analogy, and "pictures" are conceptual or mathematical models of reality, which we have in abundance. The problem, as always, is finding out which model is the best match to the distribution of puzzle pieces we have.
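To make "best match" a little more concrete, here is a minimal sketch in Python that scores two candidate pictures against a handful of pieces using a multinomial log-likelihood. Everything in it is invented for illustration: the model names, the colour proportions, and the piece counts are assumptions, not data from any real puzzle, and likelihood is only one of several reasonable ways to compare models.

```python
import math

def log_likelihood(counts, model):
    """Multinomial log-likelihood of colour counts under a model.

    The combinatorial constant is omitted because it is the same for
    every model and so does not affect the comparison.
    """
    return sum(n * math.log(model[colour]) for colour, n in counts.items())

# Hypothetical colour proportions each candidate picture would produce.
models = {
    "seaside": {"blue": 0.70, "green": 0.10, "other": 0.20},
    "forest": {"blue": 0.15, "green": 0.65, "other": 0.20},
}

# Hypothetical handful of pieces pulled from the box.
pieces = {"blue": 14, "green": 3, "other": 3}

scores = {name: log_likelihood(pieces, model) for name, model in models.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: log-likelihood {score:.2f}")
print(f"best match: {best}")
```

With these made-up numbers the mostly blue handful favours the seaside picture, but only in the "probably, not certainly" sense: a different handful from the same box could shift the scores.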
Puzzle pieces don't come to us unheralded, and as such we often have expectations of what distribution we're likely to see as we start collecting them.
Those expectations often turn out not to be a very good match for what we actually see in the world, which is when things become fun.
Consider, for example, the distribution of dinosaur sizes.
Dinosaurs were, and mammals are, the dominant megafauna of their eras, so we might expect their size distributions to be similar. Nope.
Modern mammals have a distribution of sizes (masses) that peaks at about 100 g and then tails off fairly smoothly toward elephants at some thousands of kilograms. The average land mammal is a chipmunk.
Dinosaurs--based on the fossil record--had a quite different distribution, in two respects. First: there were more big dinosaurs than little ones, so their distribution skews toward the heavy end of the scale. Second: their distribution is getting on for bimodal, with two fairly distinct peaks: a big one in the thousand-plus-kilo range, and another at around 5 kg, with a pronounced dip in between.
How come?
That is: "Why are there so few mid-sized dinosaurs?" or "Why are there so many big dinosaurs?" Based on modelling work by Daryl Codron and colleagues at the University of Zurich and elsewhere there seems to be a single reason for both these features, which is always encouraging: invoking a different explanation for every feature of the world you encounter quickly starts to look like you're just making things up.
That reason comes down, fundamentally, to the problem of eggs.
Dinosaurs, like birds and reptiles today, laid eggs. Mammals give birth to live young. Embryos developing inside egg shells need oxygen, which means the shell has to be somewhat porous. This limits the thickness of the shell, which limits the strength of the shell, which limits the maximum size of the egg, because the stresses on an eggshell increase as the size of the egg increases.
Given there is a fixed upper limit to the size of an egg but no fixed upper limit to the size of a dinosaur, this creates a situation where very large dinosaurs had eggs that were comparatively tiny, with the adults being over a thousand times the mass of the eggs! For mammals this ratio is well under a hundred to one.
A creature that is born small and destined to grow big--if it lives--needs to grow quickly, and in doing so it passes through all the masses between birth and adulthood. A carnivorous dinosaur that hatches out of its egg at a few kilograms needs to pack on mass quickly if it's going to survive, which means it needs to be an aggressive predator while it's 10, 20, 50, 100, 200, 500, 800, 1200... kg.
Anything trying to live a quiet adult life in those mass ranges is going to get eaten, and the Codron modelling suggests that such creatures would likely be driven to extinction, particularly because those big carnivores were likely laying a lot of eggs in each clutch.
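The growth argument can be put in rough numbers. The sketch below takes the two adult-to-newborn mass ratios quoted above ("over a thousand" for large dinosaurs, "well under a hundred" for mammals) and converts each into the number of mass doublings and orders of magnitude a youngster has to grow through. The function name is hypothetical, and the ratios are used as bounds taken straight from the text, not as measured values.

```python
import math

def growth_span(adult_to_birth_ratio):
    """Return (doublings, decades) of body mass between birth and adulthood."""
    return math.log2(adult_to_birth_ratio), math.log10(adult_to_birth_ratio)

# Bounds quoted in the text, used as-is rather than as measurements.
dino_doublings, dino_decades = growth_span(1000)     # "over a thousand" to one
mammal_doublings, mammal_decades = growth_span(100)  # "well under a hundred" to one

print(f"large dinosaur: at least {dino_doublings:.1f} doublings "
      f"({dino_decades:.1f} orders of magnitude)")
print(f"mammal: fewer than {mammal_doublings:.1f} doublings "
      f"({mammal_decades:.1f} orders of magnitude)")
```

On these bounds a big dinosaur has to double its mass roughly ten times or more on the way to adulthood, against fewer than seven for a mammal, so its young spend time in at least one extra order of magnitude of body size that they have to eat their way through.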
Based on teeth and defences we tend to think of the world of the dinosaurs as particularly violent and dangerous. We can now see this as a necessary consequence of large carnivores laying small eggs, and laying waste to everything between themselves and their newborn young.
That's what the distributions are telling us, even though we were not there and never can be. Knowledge lies in the distributions.