To return to a previous example, we make our judgments of someone’s personality on the basis of a very small sample of his or her behavior and expect this person to behave in similar ways in the future when we encounter further samples of behavior. We are surprised, and sometimes indignant, when the future behavior does not match our expectations.
By making our samples sufficiently large, we can guard against distortions due to “runs of luck,” but even very large samples can give us a poor basis for a statistical generalization. Suppose that Harold has tried hundreds of times to use a Canadian quarter in an American payphone, and it has never worked. This will increase our confidence in his generalization, but size of sample alone is not a sufficient ground for a strong inductive argument. Suppose that Harold has tried the same coin in hundreds of different payphones, or tried a hundred different Canadian coins in the same payphone. In the first case, there might be something wrong with this particular coin; in the second case, there might be something wrong with this particular payphone. In neither case would he have good grounds for making the general claim that no Canadian quarters work in any American payphones. This leads us to the third question we should ask of any statistical generalization.