What is Statistics: Definition and 998 Discussions
Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments. When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation.
Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation). Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and each other. Inferences on mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.
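As a minimal sketch of the descriptive side, the snippet below computes one measure of central tendency (the mean) and one of dispersion (the standard deviation) with Python's standard `statistics` module. The sample values are made up purely for illustration.

```python
import statistics

# Hypothetical sample; the values are illustrative only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(sample)              # central tendency (location)
sample_sd = statistics.stdev(sample)        # dispersion, n - 1 denominator
population_sd = statistics.pstdev(sample)   # dispersion, n denominator
```

The two standard deviations differ only in the denominator: `stdev` treats the data as a sample from a larger population, while `pstdev` treats the data as the whole population.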
A standard statistical procedure involves the collection of data leading to a test of the relationship between two statistical data sets, or a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data that are used in the test. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (null hypothesis is falsely rejected giving a "false positive") and Type II errors (null hypothesis fails to be rejected and an actual relationship between populations is missed giving a "false negative"). Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis. Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also occur. The presence of missing data or censoring may result in biased estimates and specific techniques have been developed to address these problems.
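To make the Type I error concrete, here is a small simulation sketch. It assumes a two-sided z-test with known standard deviation 1 and a significance level of 0.05; the sample size, number of trials, and seed are arbitrary choices, not part of the text above. Because the data are generated with the null hypothesis true, every rejection is a false positive.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the run is repeatable

ALPHA = 0.05
N = 30          # observations per simulated study (assumed)
TRIALS = 2000   # number of simulated studies (assumed)
critical = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96

rejections = 0
for _ in range(TRIALS):
    # Data generated with the null true: mean 0, known standard deviation 1.
    sample = [random.gauss(0.0, 1.0) for _ in range(N)]
    z = fmean(sample) * N ** 0.5
    if abs(z) > critical:
        rejections += 1  # the null is true, so this is a Type I error

type_i_rate = rejections / TRIALS
```

Under these assumptions `type_i_rate` should land near the chosen significance level, which is what "controlling the Type I error rate" means in practice.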
I am stuck at: what fraction of the shoes were size 4? I know the frequency is missing. The frequencies are 8, 9, 9, 3, 1.
My thinking is that you have to add the frequencies and put 4 over it. Explain to me if I'm wrong.
Thanks in advance
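A sketch of the usual relative-frequency calculation, assuming the five numbers quoted are the class frequencies. The post does not say which count belongs to size 4, so the snippet lists the fraction for every class. The numerator is the frequency of the class, not the shoe size itself.

```python
from fractions import Fraction

frequencies = [8, 9, 9, 3, 1]  # from the post; which class is size 4 is not stated
total = sum(frequencies)       # number of shoes counted in all

# Fraction of the whole for each class, reduced to lowest terms.
relative = [Fraction(f, total) for f in frequencies]
```

For example, a class with frequency 9 out of 30 gives 9/30 = 3/10.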
Hello,
I am trying to wrap my head around the difference between machine learning and statistics for predictive purposes and interpretability...
Is there a sharp difference between the two in terms of predictive power? I understand how machine learning needs to be first trained with data to...
I'm not really sure where to put this. I came across this nice article on Medium. “The 5 Basic Statistics Concepts Data Scientists Need to Know” by George Seif https://link.medium.com/APtnCOapOV
The data represent the time, in minutes, spent reading a political blog in a day. Construct a frequency distribution using 5 classes. In the table, include the midpoints, relative frequencies, and cumulative frequencies. Which class has the greatest frequency and which has the least...
I was told to generate these variables (m, C, alpha, wind velocity) as normally distributed, compare the random data with the result, and then tell which of the variables has the most impact. Here I am stuck: I tried to compare the variances, kurtosis, and skewness of the data (the original variables...
I have collected a large number of oscilloscope traces (these relate to the modes of a laser cavity). Four sets are shown here:
There are two ways of finding the average spacing ##\Delta t## between adjacent dips:
1. Measure all the individual ##\Delta t##s from each measurement...
Monthly Cycle numbers
Here is the cycle ratio:
$$2_{early}:2_{fertile}:1_{late}$$
And the numbers:
$$20,000_{early}:20,000_{fertile}:10,000_{late}$$
Now, let's divide the early into 2 groups, pre-fertile, and safe and assume there is a 50/50 split between those 2 groups. Let's also assume...
Hello, I'm not familiar with basic statistics exercises using pie charts and I'd like to know how I can solve this one: in a science faculty of 3000 students, some are in biology, some in physics and some in chemistry, as shown on the pie chart. I can for sure tell that 1500 students are in...
Homework Statement
Hello, I was given 2 sets of data, showing 20 temperature values and 35 temperature values respectively. The data sets look like below:
Data 1 Data 2
Temperature Temperature
30.9...
**Reposting this again, as I was asked to post this on a homework forum**
1. Homework Statement
Hi,
I am trying to solve this math equation (which I found in a paper) for the variance of a noise after it passes through an LTI system whose impulse response is h(t).
X(t) is the input noise...
Phil Plait, creator of Bad Astronomy, has an article on Planet 9. Overall, it's pretty good, but there was one part that got my hackles up:
It may be more clear, but at a cost of being more wrong.
Simple linear regression statistics:
If I have a linear relation (or wish to prove such a relation): y = k x where k = constant. I have a set of n experimental data points ...(y0, x0), (y1, x1)... measured with some error estimates.
Is there some way to present how well the n data points show...
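One common way to summarize a fit of the form y = k x is the least-squares slope through the origin together with its standard error. The sketch below uses made-up data points and ignores the per-point error estimates mentioned in the post (an unweighted fit), so it is only a starting point under those assumptions.

```python
import math

# Hypothetical measurements (x_i, y_i); not taken from the post.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Least-squares slope for a line forced through the origin.
k = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Residual standard error with n - 1 degrees of freedom (one fitted parameter).
residuals = [y - k * x for x, y in zip(xs, ys)]
s = math.sqrt(sum(r * r for r in residuals) / (len(xs) - 1))

# Standard error of the slope estimate.
se_k = s / math.sqrt(sum(x * x for x in xs))
```

A small `se_k` relative to `k` suggests the points are well described by a proportional relation. With known measurement errors, a weighted fit would be the more natural choice.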
I know this is probably the wrong site to post this, but...
Hello, I am a student at a low-ranked college in New York State actively pursuing a bachelor's (BA) in Math in my junior year. I have a 3.7 GPA overall and a 3.73 in Math. I am looking to apply to PhD programs next year in Statistics or in...
Hi,
I am trying to solve this math equation on finding the variance of a noise after passing through a system whose impulse response is h(t)
X is the input noise of the system and Y is the output noise after system h(t)
If, let's say, the variance of noise Y is
$$\sigma_Y^2=\iint R_{XX}(u,v)\,h(u)\,h(v)\,du\,dv$$
where...
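A discrete-time analogue of the double integral can make the formula easier to check numerically. The sketch below assumes zero-mean input noise, replaces the integrals with sums over filter taps, and uses a hypothetical three-tap impulse response and white input noise. None of these values come from the post.

```python
def output_variance(h, autocorr):
    """Discrete analogue: sum over u, v of R_XX(u, v) * h[u] * h[v]."""
    n = len(h)
    return sum(autocorr(u, v) * h[u] * h[v] for u in range(n) for v in range(n))

sigma2 = 2.0          # assumed input noise variance
h = [0.5, 0.3, 0.2]   # hypothetical impulse response taps

def white(u, v):
    """Autocorrelation of white noise: sigma2 on the diagonal, 0 elsewhere."""
    return sigma2 if u == v else 0.0

var_y = output_variance(h, white)
```

For white noise the double sum collapses to the input variance times the sum of the squared taps, which is a handy sanity check on any implementation.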
I am learning for my exam in particle physics. One topic is statistical physics. There I ran into this question:
Consider an atom at the surface of the Sun, where the temperature is 6000 K. The atom can exist in only 2 states. The ground state is an s state and the excited state at 1.25 eV is a...
Homework Statement
Consider the exponential probability density function with location parameter ##\theta## :
$$f(x|\theta) = e^{-(x-\theta)}I_{(\theta,\infty)}(x)$$
Let ##X_{(1)}, X_{(2)},...,X_{(n)}## denote the order statistics.
Let ##Y_i = X_{(n)} - X_{(i)}##.
Show that ##X_{(1)}## is...
<Moderator's note: Moved from a homework forum.>
| Mass (g), +/- 0.01 | Drop height (cm), +/- 3.00 | Shell |
| --- | --- | --- |
| 53.47 | 45 | No crack |
| 56.78 | 45 | Cracked... |
Homework Statement
I am trying to determine the characteristic function of the function:
$$ f(x)= ae^{-ax}$$
$$\therefore E(e^{itx}) =\int_0^\infty e^{itx}ae^{-ax} dx = a \cdot \frac{e^{(it-a)x}}{it-a} \Big|_0 ^ \infty $$
But I am not sure how to evaluate the integral.
Wolfram alpha suggests this...
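For reference, the evaluation can be sketched as follows, assuming ##a > 0## (which is needed for ##f## to be a density on ##[0, \infty)##). The upper limit vanishes because ##|e^{(it-a)x}| = e^{-ax} \to 0## as ##x \to \infty##:
$$E\left(e^{itX}\right) = a\int_0^\infty e^{(it-a)x}\,dx = a\left[\frac{e^{(it-a)x}}{it-a}\right]_0^\infty = a\cdot\frac{0-1}{it-a} = \frac{a}{a-it}$$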
Hey there! I've finished reading The Introduction to Probability and Statistics book. Now I'm looking for another book as awesome as that was. Can anyone suggest a book? Thanks in advance.
Hello everyone. I'm having trouble understanding this example: https://ecee.colorado.edu/~bart/book/book/chapter2/ch2_5.htm#2_5_2
In this system of 20 electrons with equidistant energy levels, how is it known that there are only 24 possible configurations, and how are those configurations found?
I read that question and Weinberg's book. (A,B)-Representation of Lorentz Group: Coefficient functions of fields
Why is ##u(a,b)=C_{ab}/\sqrt{2m}##,
where ##C_{ab}## are the Clebsch-Gordan coefficients and ##m## is the mass of the particle?
Homework Statement
Given a group of 100 married couples, let X1 be the number of sons and X2 the number of daughters the couple has.
P(X1 = 0, X2 = 2) = f(0, 2) = 8 /100 = 0.08
2. Homework Equations The Attempt at a Solution
I tried to look for a similar example online, I found this...
Many times, the charge carrier density of a material is determined from a Hall effect experiment, from ##R_H=1/(ne)## (SI units), where ##R_H## is determined from a measured voltage and other controllable parameters. As far as I know, this simple formula comes from the obsolete Drude model...
Homework Statement
For reference:
Book: Mathematical Statistics with Applications, 7th Ed., by Wackerly, Mendenhall, and Scheaffer.
Problem: 10.81
From two normal populations with respective variances ##\sigma_1^2## and ##\sigma_2^2##, we observe independent sample variances ##S_1^2## and...
I am trying to use PCA to classify various spectra. I measured several samples to get an estimate of the population standard deviation (here I've shown only 7 measurements):
I combined all these data into a matrix where each measurement corresponded to a column. I then used the pca(...)...
Homework Statement
Given an interaction Lagrangian $$ \mathcal{L}_{int} = \lambda \phi \bar{\psi} \gamma^5 \psi,$$ where ##\psi## are Dirac spinors, and ##\phi## is a bosonic pseudoscalar, I've been asked to find the second order scattering amplitude for ##\psi\psi \to \psi\psi## scattering...
Homework Statement
I have solved the rest of this problem pretty easily and see no problems with working with indistinguishable particles, distinguishable particles, fermions and bosons. Part c has me very confused, though, about what it is even asking.
Suppose a system with equally spaced...
I am currently in the second year of college, doing physics, but since the middle of the year I have been hooked on data science, and I have decided I want to follow this career. What is the best thing I can do now? Finish my BSc and go for an MSc in statistics? Change my major to statistics? As I still like...
Homework Statement
Let ##U_1, U_2, U_3## be independent uniform on ##[0,1]##.
a) Find the joint density function of ##U_{(1)}, U_{(2)}, U_{(3)}##.
b) The locations of three gas stations are independently and randomly placed along a mile of highway. What is the probability that no two gas...
Homework Statement
Let's say I have a list of numbers.
income=[17000, 11000, 23000, 19999, 21000, 10000]
I sort them: income_sorted=[10000, 11000, 17000, 19999, 21000, 23000]
Calculate the median (2nd quartile).
Homework Equations
Median_formula = (n+1)/2
The Attempt at a Solution
The second...
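A short sketch of the median calculation for the list in the post. The expression (n+1)/2 gives the position of the median in the sorted list, not its value. With six values that position is 3.5, so the median is the average of the 3rd and 4th sorted values. The variable names other than `income` and `income_sorted` are illustrative.

```python
import statistics

income = [17000, 11000, 23000, 19999, 21000, 10000]
income_sorted = sorted(income)

n = len(income_sorted)
position = (n + 1) / 2  # position of the median in the sorted list (1-based)

# With an even count, average the two middle values.
lower = income_sorted[n // 2 - 1]  # 3rd sorted value
upper = income_sorted[n // 2]      # 4th sorted value
median = (lower + upper) / 2

# Cross-check against the standard library.
library_median = statistics.median(income)
```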
https://arstechnica.com/science/2018/10/fixed-mindsets-might-be-why-we-dont-understand-statistics/
The article outlines our difficulty in understanding statistics in everyday problems or while on juries.
Homework Statement
Suppose that the number of eggs laid by a certain insect has a Poisson distribution with mean ##\lambda##. The probability that any one egg hatches is ##p##. Assume that the eggs hatch independently of one another. Find the expected value of ##Y##, the total number of eggs...
Hello all, first post. I'm interested in granular load factor statistics for the Palo Verde plant, and LF reporting practices in general. Reviewing the LF stats for PV in the PRIS DB, I'm a bit confused. The annual LF in 2002 was 102. PRIS indicates that 102 is a percentage. What I would...
Homework Statement
##-1\leq\alpha\leq 1##
##f(y_1,y_2)=[1-\alpha\{(1-2e^{-y_1})(1-2e^{-y_2})\}]e^{-y_1-y_2}, 0\leq y_1, 0\leq y_2##
and ##0## otherwise.
Find ##V(Y_1-Y_2)##. Within what limits would you expect ##Y_1-Y_2## to fall?
Homework Equations
N/A
The Attempt at a Solution...
Homework Statement
Homework Equations
Chebyshev's Theorem: The percentage of observations that are within k standard deviations of the mean is at least
##100\left(1 - 1/k^2\right)\%##
Chebyshev's Theorem is applicable to ANY data set, whether skewed or symmetrical.
Empirical Rule: For a symmetrical...
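The Chebyshev bound quoted above is simple enough to tabulate directly. The helper below is a minimal sketch. The function name is made up for illustration, and the restriction to k > 1 is the usual condition under which the bound says anything useful.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 100 * (1 - 1 / k**2)

within_2_sd = chebyshev_lower_bound(2)  # at least this percentage within 2 SDs
within_3_sd = chebyshev_lower_bound(3)  # at least this percentage within 3 SDs
```

For k = 2 the guarantee is at least 75%, which is weaker than the figure the Empirical Rule gives for symmetrical, bell-shaped data, because Chebyshev's bound has to hold for every data set.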
Hey all,
Joined a Material Science program and I have a course on Multivariate Stats. I have pretty minimal background on the topic so I thought I would pop back here asking for recommendations for a good start on the topic.
Here's the course description -
Homework Statement
For a project, I am trying to estimate the absorption curve of a certain plant species. Due to the variability within the species, I have taken the average of several measurements.
In the figure below, the blue curves represent 3 separate measurements, and the red curve is...
This may be better suited in the academic forum, or possibly not even the normal type of question asked, but I was just judging based on other similar posts.
I just graduated from college this past spring with a BS in Applied Mathematics and a BS in Physics, as well as a minor in computer...
I am trying to write a program that calculates the root of chi-square. I am not getting the correct answer and I honestly am at my wits' end trying to figure it out. I know my simp_p() method is returning the correct value, but for some reason my root_chisq() method is not giving me the correct...
Suppose I have the random variables ##Z_k=X_k/Y_k## with a PDF ##f_{Z_k}(z_k)## for ##k=1,\,2,\,\ldots,\,K##, where ##\{X_k, Y_k\}## are i.i.d. random variables. I can find
$$\text{Pr}\left[\sum_{k=1}^3Z_k\leq \eta\right]$$
as...
Hello! New here. Sort of. I have a major in math. I'd like to specialize in data science for my master's degree. This is my 2nd account here. I don't know how to delete the other one. Can anyone tell? Thanks.
Typical random walk:
one step forward or backward with equal probability, each step independent. What are the expectation and variance?
So I define the indicator variable ##x_i = \pm 1##, each value with equal probability.
##E(x_i) = 0##
##Var(x_i) = 1##
Now define ##S_n## as the sum of the ##x_i## for ##i=1,\ldots,n##.
each step is...
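Because the steps are independent, the expectations and variances simply add, giving an expectation of 0 and a variance of n for the sum of n steps. The sketch below checks this exactly for small n by enumerating every equally likely path. The function name is illustrative.

```python
from itertools import product

def walk_moments(n):
    """Exact mean and variance of the n-step sum over all 2**n equally likely paths."""
    sums = [sum(steps) for steps in product((-1, 1), repeat=n)]
    mean = sum(sums) / len(sums)
    variance = sum((s - mean) ** 2 for s in sums) / len(sums)
    return mean, variance

mean_4, variance_4 = walk_moments(4)
```

Enumeration grows as 2**n, so this is only a check for small n, not a way to handle long walks.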
Hello,
I would be interested in a collection of experimental data for the Michelson-Morley experiment.
I would like to see whether much data is available, and whether a statistical analysis could be fun.
Do you know of any compilation of the data?
Thanks,
Michel
To begin with, I am a biophysicist, so my question is very naive.
It is my understanding that the Uncertainty Principle deals with a single event (particle). It is also my understanding that quantum physics contains a lot of statistics (probability).
The question is: in the area of...
Homework Statement
Hello! I'm trying to understand how to solve the following type of problems.
1) Random variables x and y are independent and uniformly distributed on the interval [0; a]. Find the probability density function of the random variable z=x-y.
2) Exponentially distributed (p=exp(-x)...
Hi everyone, I recently got a job through my instructor to design labs for a new course we are offering. It is Engineering Statistics for Electrical and Computer Engineering. We will be using Yates' text. We previously had to take an engineering stats class in the math department that is more...
Hi, I have a very basic statistics homework question, and I have given it a go, but I just wanted to know if I had approached it correctly:
QUESTION
A shop adjusts its staffing levels using a model of customer demand through the week in the ratio of: 0.22(Mon); 0.14(Tue); 0.16(Wed); 0.18(Thu)...
Suppose that I have these random variables ##\eta_k=\alpha_k/\beta_k## for ##k=1,\,2,\,\ldots,\,K##, where ##\{\alpha_k,\,\beta_k\}## are i.i.d. random variables. Now suppose that I select ##M\leq K## random variables such that denominators are the largest ##M## random variables. That is...