What is Statistics: Definition and 998 Discussions

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments.
When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation.
Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation). Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and each other. Inferences on mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.
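As a minimal sketch of the descriptive side described above, the snippet below computes two measures of central tendency and two measures of dispersion using Python's standard `statistics` module. The `sample` values are made up purely for illustration and do not come from any data set discussed here.

```python
import statistics

# Hypothetical sample; the values are invented for illustration only.
sample = [4.0, 5.0, 6.0, 4.0, 9.0]

# Central tendency (location): where the "typical" value lies.
mean = statistics.mean(sample)
median = statistics.median(sample)

# Dispersion (variability): how far values spread from the center.
# statistics.stdev uses the sample formula with an (n - 1) denominator.
stdev = statistics.stdev(sample)
value_range = max(sample) - min(sample)

print(f"mean={mean}, median={median}, stdev={stdev:.3f}, range={value_range}")
```

If the list is meant to be an entire population rather than a sample from one, `statistics.pstdev` (denominator n) would be the more appropriate choice.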
A standard statistical procedure involves the collection of data leading to a test of the relationship between two statistical data sets, or a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data that are used in the test. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (null hypothesis is falsely rejected giving a "false positive") and Type II errors (null hypothesis fails to be rejected and an actual relationship between populations is missed giving a "false negative"). Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis. Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also occur. The presence of missing data or censoring may result in biased estimates and specific techniques have been developed to address these problems.
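The two error types above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are normal with a known standard deviation, the test is a two-sided z-test of a mean, and the function names, sample size, effect size, and seed are all hypothetical choices for the example.

```python
import random
from statistics import NormalDist, fmean


def rejects_null(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mean == mu0, assuming sigma is known."""
    n = len(sample)
    z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return abs(z) > critical


def rejection_rate(true_mean, trials=2000, n=30, mu0=0.0, sigma=1.0, seed=0):
    """Fraction of simulated samples for which H0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        rejections += int(rejects_null(sample, mu0, sigma))
    return rejections / trials


# H0 is true here, so every rejection is a false positive (Type I error).
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false here (assumed true mean 0.5), so every non-rejection
# is a false negative (Type II error).
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(f"estimated Type I rate:  {type_i_rate:.3f}")
print(f"estimated Type II rate: {type_ii_rate:.3f}")
```

With these settings the estimated Type I rate should land near the chosen significance level of 0.05, while the Type II rate depends on the assumed effect size and sample size. Increasing `n` lowers the Type II rate, which is one way to see the sample-size concern mentioned above.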

View More On Wikipedia.org
  1. X

    Help with understanding a solution in statistics

    http://kaharris.org/teaching/425/Lectures/lec37.pdf "Fifty numbers are rounded-off to the nearest integer and then summed. Suppose that the individual round-off errors are uniformly distributed over (-0.5, 0.5). What is the probability that the round-off error exceeds the exact sum by more than...
  2. T

    Covariance betw scalar amplitude & spectral index in Planck?

    I am reading some of "Planck 2013 results. XXII. Constraints on inflation." The paper is full of values for various inflationary parameters under various models, with their confidence intervals. For instance, in Table 5 on page 13, the authors report that — for a model including both running of...
  3. A

    TI-89 Calculator question (Statistics)

    Homework Statement Calculate the mean and standard deviation for the random variable X which equals the sum of the dots facing upward when two standard six-sided dice are rolled. The Attempt at a Solution In List 1, I entered all the possible values of the random variable X...
  4. EternusVia

    Income Distribution Data Sources for Statistical Analyses

    Hello all! I'm hoping you will be able to direct me to sources of raw income / employment data. Hopefully this thread can be a help to others who have similar needs in the future, as well. (I did a quick search for a similar thread and didn't find anything - please correct me if I'm wrong.)...
  5. D

    Roulette Problem: Solving Z Distribution & Win/Loss Probabilities

    Hey guys, long time since I've posted on here, got this random question a friend of mine and myself were wondering over. 1. Homework Statement A player bets 100 dollars continuously on any number besides 0 (European roulette so 1/37 chance to win). He plays until he wins once and stops playing...
  6. Z

    Properties of Univariate statistics.

    Hi PF, I have several questions about univariate statistics that don't seem to be covered in my notes or online. I hope the question is not redundant on the forums, but I ran a search and saw nothing. In univariate statistics, you can have a PMF which is a discrete random variable (RV) and a...
  7. R

    Fun maths question, correlation?

    Say I'm in the cinema and there are 4 seats. Say the closest seat has the best view, which is 4 meters away. Say the further the distance away, the worse the viewing quality, e.g. [5,6,4,9] are the distances of seats away from the screen. Now let's say people sitting in front of you also affects viewing...
  8. AntSC

    S1 Probability Coin Toss

    Having trouble with certain binomial and geometric distribution questions, which is indicating that my understanding isn't completely there yet. Any help would be greatly appreciated. 1. Homework Statement A bag contains two biased coins: coin A shows Heads with a probability of 0.6, and coin...
  9. C

    Probability Conditional Expectation

    Suppose X and Y are independent Poisson random variables with respective parameters λ and 2λ. Find E[Y − X|X + Y = 10]3: I had my Applied Probability Midterm today and this question was on it. The class is only 14 people and no one I talked to did it correctly. The prof sent out an e-mail saying...
  10. A

    Most employable emphasis for Applied Statistics major?

    http://oi59.tinypic.com/50lfv9.jpg Above is a link to the list of the three different emphasis offered at my university for statistics major (it is named "statistics" major but has an "applied statistics" curriculum). Please tell me which emphasis of the three you think is most employable...
  11. W

    Certifying Knowledge of Statistics

    Hi, just curious, I have taught myself a good chunk of statistics, up to non-parametric methods, but I don't have any official proof of it, i.e., I don't have a degree in it I could use to convince a potential employer. I know in other areas one may use, e.g., the GRE to show one's competence...
  12. icecubebeast

    Best AP Statistics Textbooks for Acing the Exam in 3 Months

    Hello, I want to take the AP Statistics test within 3 months and I need to have a statistics book to start preparing for it. What books do you recommend for any student who needs all the necessary information to study for an AP Statistics exam and be able to pass with a 5? By book I mean...
  13. K

    Exploring the Best Method for Finding the Mode in Statistical Data

    Till now I used the normal method for finding the mode of a given data set, that is, simply look for the most frequently occurring observation and label it as the mode. But recently I have encountered another method for finding the mode, in which it was also stated that my old method for...
  14. S

    MHB Statistics - Finding a relationship?

    let a and b be constants and let y_j = ax_j+b for j = 1,2...n. What are the relationships between the means of y and x, and the standard deviations of y and x? I'm not sure what they are wanting here?
  15. D

    How to Calculate the Mean Fraction of Occupied Seats in a Row?

    There are a set of kids (let's say N=50) asked to sit in a row of seats, leaving at least one empty seat between them until all seats are filled. At the end, how do I calculate mean of the fraction of occupied seats? What will be the input to calculate mean?
  16. C

    MHB More statistics: counting problem

    "For years, telephone area codes in the U.S. and Canada consisted of a sequence of three digits. The first digit was an integer between 2 and 9, the second digit was either 0 or 1, and the third digit was any integer from 1 to 9. (1) How many area codes were possible? (2) How many area codes...
  17. sheldonrocks97

    Probability that a Year with 53 Sundays is a Leap Year?

    Homework Statement A year has 53 Sundays. What is the conditional probability that it is a leap year? Homework Equations None that I can think of. The Attempt at a Solution I tried by knowing that a leap year has 366 days. Next we can note that the remaining 2 days could be Sunday and Monday...
  18. P

    Prob/Stats What Book Best Explains the Distribution of Combined Random Variables?

    Hello everyone. Recently I've been working on a project involving some statistics, and I've found that I lack knowledge about this. So I'm looking for a book about probability and statistics, but I'm especially interested in how one can get the distribution of a combination of others...
  19. MidgetDwarf

    Prob/Stats Introduction to statistics and number theory books

    I am inquiring about a good introductory statistics book or books that supplement each other well. My math background consists of calculus 2, linear algebra, and ODE. This is for a first course in statistics. Also, what would be a good introductory number theory book? Or should I complete...
  20. C

    Is Taking AP Statistics Necessary for a Physics Education?

    Hello. So I'm graduating this year and will be attending university to study physics and math. AP courses my school offers include AP Calculus and AP English. I did Calculus last year, and this year will be taking AP English. My school doesn't offer AP Chemistry, but (I live in British Columbia)...
  21. Mr Davis 97

    Relation between variables and distributions in statistics

    I am a little confused about how variables are related to distributions as one moves from descriptive statistics to inferential statistics. I know that a variable in descriptive statistics is some measurable characteristic of some phenomenon, and its distribution is some description (table or...
  22. N

    Simple question about sets in statistics

    Let's say you have 100 tickets of type A, and 100 tickets of type B in a box. Let's also say the probability to draw ticket A, for whatever reason, is twice that to draw ticket B. Is this problem, for all intents and purposes, mathematically equivalent to having 200 type A tickets and 100 type...
  23. N

    Programs Statistics vs. Astronomy for Physics Major

    I am currently signing up for classes next semester. I am already taking Linear Algebra, Physics 2 + Lab, and a required writing course. As my fourth choice I can either take Astronomy + Lab or Introduction to Probability and Statistics. The pros of statistics are that there is no lab, which...
  24. Soumalya

    Prob/Stats References to Probability & Statistics for Engineers

    I am looking for an introductory reference to the subjects 'Probability and Statistics' intended to be used for self study. By an introductory reference I mean an undergraduate level text that would teach me the subjects from the very basic to an advanced level. I wish to focus primarily on...
  25. B

    Calculating mean from 5 number summary

    It seems like it should be possible to calculate the mean (usual average) from a 5-number summary of a set of numbers (min, first quartile or Q1, median, third quartile or Q3, and max). You should be able to calculate roughly what a percentile is, then by taking each discrete percentile and...
  26. B

    Using approximations to the binomial distribution

    Homework Statement This is the problem I am given. It is in the picture below or in the thumbnail. I was also told that since ##n## is big enough that I can use normal approximations. Homework Equations The Attempt at a Solution I think that ##C_{\alpha}=C_{0.1}=2.33## which I got off the...
  27. Q

    Knee deep in Bayesian statistics and

    Hello all, To just begin, I am having a lot of trouble keeping my brain in the Bayesian view and not letting it revert back to a Frequentist way of thinking. Not to mention having troubles with unbinned MLE estimation. If I say something wrong, please let me know. My questions are...
  28. A

    Use of statistics in experiment

    I have seen that "the best estimate for the random error σ(X) in a single measurement is given by ##\sigma(X)^2 \approx \frac{1}{n-1} \sum_i (x_i - \mu)^2## where the sum is over all i" I have two questions about this: firstly, how can this pertain to a "single measurement" if it requires the data from multiple...
  29. C

    Philosophical question about central limit theorem

    Well, this is probably a stupid question, but I don't see why (yet). Let Xi be random variables identically distributed, with mean 0, such that the cumulative distribution is = 0 for all -1 < x < 1. So, I believe it is clear that for all n, the cumulative distribution of Z = (X1 + X2 ... Xn)/n...
  30. F

    Standard deviation of m sample means of n observations each

    Homework Statement "Consider a standard uniform density. The mean for this density is .5 and the variance is 1 / 12. You sample 1,000 sample means where each sample mean is comprised of 100 observations. You take the standard deviation of the 1,000 sample means. About what number would you...
  31. M

    Probability and statistics book

    Hello :-) I need a book on probability and statistics which includes the topics: Probability; Events and probability space; Independent and dependent events; Bayes' theorem; Combinatorics; Mean, median, mode; Range, interquartile range. Any ideas?
  32. 9

    Two measurements from different sources - how to combine?

    I can get a measurement of my distance between points (x1,y1,z1 and x2,y2,z2) by analysing position data from a GPS/barometer system, which has a standard deviation of about 2m for x y z positions. I can also analyse data from a system that provides velocity with a standard deviation of about...
  33. W

    Statistics vs computer science

    I have read course descriptions for both, but I want to hear honest opinions from people who actually are in, or have finished, either program. What is it like? I know they're both heavy on mathematics and that is fine because I like mathematics. I'm currently learning Java whenever I get spare...
  34. W

    Statistics over computer science?

    I put engineering as my primary choice for university application and computer science as my second. However, I am afraid that my grade might not be high enough to get into the engineering school, and I am hearing a lot of negative things about computer science. I have been thinking of other...
  35. B

    Difficult computational statistics problem

    I've got a tricky computational statistics problem and I was wondering if anyone could help me solve it. Okay, so in your left pocket is a penny and in your right pocket is a dime. On a fair toss, the probability of showing a head is p for the penny and d for the dime. You randomly choose a...
  36. B

    Finding the conditional distribution

    Hey guys, I'm trying to find a conditional distribution based on the following information: ##Y|u \sim Poisson(u \lambda)##, where ##u \sim Gamma(\phi)## and ##Y \sim NegBinomial(\frac{\lambda \phi}{1+ \lambda \phi}, \phi^{-1})## I want to find the conditional distribution ##u|Y## Here's what I've got so...
  37. Mogarrr

    Contingency Table Interpretation

    Homework Statement 10.24 Is there any relationship between the type of treatment and the response? What form does the relationship take? Here's that data (column variables are responses): \newcommand\T{\Rule{0pt}{1em}{.3em}} \begin{array}{|c|c|c|c|c|} \hline Treatment & +Smear &...
  38. Mogarrr

    Biostatistics, help deciding on a test

    Homework Statement Refer to Table 2.11. 10.6 What significance test can be used to assess whether there is a relationship between receiving an antibiotic and receiving an antibiotic culture while in the hospital? Here's my attempt of recreating the table: \newcommand\T{\Rule{0pt}{1em}{.3em}}...
  39. Q

    Is it possible to create order from disorder in a thermodynamic system?

    Hi all I hope you can help me with the statistical origins of the Second Law. I cannot find anything that mathematically proves that order from disorder is impossible only improbable. Leading me to think that a system (Kelvin engine) that allows order to be created from disorder (work from...
  40. Schnurmann

    Least Squares Estimation - Problem with Symbols

    Hi folks, 1. Homework Statement I don't fully understand the question statement, how is it supposed to be read? Question: Give a formula for the minimizer x* (to be read as x-star) of the function ##f: \mathbb{R}^n \to \mathbb{R}##, ##x \mapsto f(x) = \|Ax-b\|_2^2##, where ##A \in \mathbb{R}^{m \times n}## and ##b \in \mathbb{R}^m## are given. You can assume that A has rank...
  41. M

    Can someone recommend me a good Probability and Statistics textbook?

    I want to take an introductory course on Probability and Statistics but I want a good textbook that has lots of practice problems for this course. Please tell me the name of the textbook, author and edition.
  42. M

    Should I take Probability and Statistics?

    I've never taken a course in Probability and Statistics. In high school, I've taken AP Statistics but got F's and got a 1 on the AP exam. I have taken Calculus I, Calculus II, Calculus III, Linear Algebra, and currently learning Differential Equations. I want to pursue Computer Science and...
  43. V

    What is the Best Way to Determine Sampling Accuracy in Large Populations?

    Good afternoon! Suppose I have a box with N marbles of different color and I want to know the ratio of the number of green ones to N (number X). The number of marbles (N) is so huge that there's absolutely no way to get them all out of the box and count. What I do instead is I take a sample of...
  44. T

    MHB How is Probability Applied in Newspaper Reading Time Statistics?

    The time spent in minutes in reading newspapers for an adult per day can be approximated by a normal distribution with a mean of 15 minutes and a standard deviation of 3 minutes. A-Find the probability that the reading time per day for a randomly selected adult is more than 18 minutes B-If 200...
  45. R

    Seeking recommendations on statistics textbooks

    Hello. I'm currently taking a lower level course in statistics. It is an okay course; however, it is not very rigorous and my professor is a "this is what you need to know" sort of teacher versus one who explains the theory and reasons behind the equations as well. The required textbook for...
  46. P

    How Do You Compute Value at Risk for a Two-Year Investment with Compound Growth?

    Homework Statement Find the VaR for an investment of $500,000 at 1% given that the investment is expected to grow 10% every year with standard deviation of 35% and that the investment is held for two years. Homework Equations E(X + Y) = E(X) + E(Y) E(X*Y) = E(X) * E(Y) (for independent...
  47. muraii

    Open-Source Curricula for Self-Study & Collaboration

    A few months ago I asked for recommendations for textbooks on generalized linear models. (Nevermind that I seem not to have actually asked anything.) I'm still in the same boat but am considering a different approach. I mention in that I felt approaching practical (mostly work) and theoretical...
  48. qspeechc

    Testing Randomness in a Set of 200+ Data Points

    Hi everyone. It's been years since I've done any stats, so I need a bit of help, please. I want to include it in a blog post I'm going to do (not here on PF), so I don't want to give away too many details :p I apologise for my terrible understanding of stats, please be patient! Anyway, over...
  49. F

    Please Recommend Me a Statistics Book based on Matrices

    Hi, can you guys recommend a book for an undergrad that uses matrices properties to explain statistics? Thx!
  50. X

    How Likely Is It That Mechanics Take Over 8.7 Hours to Rebuild a Transmission?

    Mod note: This post was moved from another forum section, so doesn't use the homework template. A study of the amount of time it takes a mechanic to rebuild the transmission for a 2005 Chevrolet Cavalier shows that the mean is 8.4 hours and the standard deviation is 1.8 hours. If 40 mechanics...