1 What is statistics?
[1] How many fish are in this fishbowl?
[2] Is it possible for goldfish to be of different colors from one another?
[3] Is it possible for goldfish not to be gold-colored?
Are these questions about data?
Are these statistical questions?
No. That's analytics, not statistics.
[1] Statistics is about going beyond the data.
[2][3] You can answer the questions with certainty.
So what is statistics?
Statistics is the science of making decisions under uncertainty.
—Savage, The foundations of Statistics, 1954
I like to think of statistics as the science of changing your mind.
Bayesian statisticians change their minds about beliefs. They like to report results using credible intervals.
Frequentist (classical) statisticians change their minds about actions.
2 Population, Sample, Observation
Population: the collection of all items we are interested in (for your decision).
Sample: any subset of the population of interest
Observation: any single measurement in the sample
A legal contract in Statistics: The truth (population), the whole truth (population) and nothing but the truth (population).
3 What is a statistic?
statistic: a summary measure computed from a sample
(Any way of mushing up data.)
Statistics: the science of changing your mind under uncertainty.
4 Proof that statistics are boring
By definition, only the population is of interest to you.
We take some boring trees, collect some data on those boring trees, and mash up that data. What comes out must be boring.
The digestion of statistics ==> Statistics
Statistics digests your statistic into something that helps you take a reasonable course of action involving the population.
5 Parameter
parameter: summary measure of a population
statistic: a summary measure computed from a sample
When you have all the data you're interested in, the sample is the population. You've got the answer with certainty.
Let's make a reasonable decision based on partial information.
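The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the population of tree heights, its size, and the sample size of 50 are made-up assumptions, and the mean is just one possible summary measure.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (in metres) of all 10,000 trees we care about.
population = [random.gauss(20.0, 5.0) for _ in range(10_000)]

# Parameter: a summary measure of the population (usually unknown in practice).
parameter = statistics.mean(population)

# Sample: a subset of the population; each element is one observation.
sample = random.sample(population, 50)

# Statistic: the same summary, computed from the sample only.
statistic = statistics.mean(sample)

# When the sample is the whole population, the statistic equals the parameter.
census_statistic = statistics.mean(population)

print(f"parameter (population mean):         {parameter:.2f}")
print(f"statistic (sample mean, n=50):       {statistic:.2f}")
print(f"statistic when sample == population: {census_statistic:.2f}")
```

The sample mean lands near the population mean but not on it. That gap is the uncertainty the rest of these notes are about.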
6 Uncertainty
If you don't have all the information, you can't know the answer. That's what it means to be uncertain.
7 What is a hypothesis?
parameter: the number we wish we knew for our decision.
hypothesis: description of how the parameter might look
8 Estimate
The facts (data) we have:
Sample: subset of the population of interest
Statistic: summary computed from a sample
Observation: a single measurement in the sample
The facts we wish we had:
Population: collection of all items we are interested in
Parameter: summary measure of a population
Attempts to bridge the gap:
Hypothesis: description of how the parameter might look
Estimate: a best guess about the true value of a parameter
(method of moments estimation, maximum likelihood estimation...)
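To make the two named estimation methods concrete, here is a minimal sketch on a toy problem that is not from these notes: data drawn uniformly from 0 to an unknown upper bound theta. The variable names and the choice of distribution are assumptions made purely for illustration. The case is convenient because the two methods give different guesses.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# The parameter. We only know it here because we simulate the data ourselves.
true_theta = 10.0
sample = [random.uniform(0.0, true_theta) for _ in range(200)]

# Method of moments: the population mean of Uniform(0, theta) is theta / 2,
# so set the sample mean equal to theta / 2 and solve for theta.
theta_mom = 2.0 * statistics.mean(sample)

# Maximum likelihood: the likelihood is theta ** (-n) for any theta that is at
# least as large as every observation, so the smallest such theta maximises it.
theta_mle = max(sample)

print(f"method of moments estimate:  {theta_mom:.3f}")
print(f"maximum likelihood estimate: {theta_mle:.3f}")
```

Both numbers are estimates of the same parameter. Neither is guaranteed to equal it, and the maximum likelihood guess here can never exceed the true bound.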
Question: Which of those two guesses forms the better guess: the first, based on one data point, or the second, based on all the data points?
More data makes the estimate better! (as long as it's relevant data.)
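A small simulation can illustrate the claim. It is a sketch under stated assumptions: the population is taken to be normal with a made-up mean and standard deviation, and `estimate_mean` and `typical_error` are hypothetical helper names, not part of any library.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

TRUE_MEAN = 20.0  # hypothetical parameter we pretend not to know
TRUE_SD = 5.0     # hypothetical spread of the population


def estimate_mean(n):
    """Best guess of the population mean from n simulated observations."""
    draws = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    return statistics.mean(draws)


def typical_error(n, trials=2000):
    """Average absolute miss of the estimate over many repeated samples."""
    misses = [abs(estimate_mean(n) - TRUE_MEAN) for _ in range(trials)]
    return statistics.mean(misses)


error_one = typical_error(1)     # guess based on a single data point
error_many = typical_error(100)  # guess based on 100 data points

print(f"typical miss with n=1:   {error_one:.2f}")
print(f"typical miss with n=100: {error_many:.2f}")
```

Under these assumptions the guess based on 100 observations misses by far less, on average, than the guess based on one.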
But how much better? And is your best guess good enough?
TBC.