Subject: Re: statistics in Roberts. Was: RAW vs. raw image format
From: Anton Shepelev
Newsgroups: alt.usage.english, sci.stat.math
Organization: A noiseless patient Spider
Date: Fri, 3 Mar 2023 20:33 UTC
Path: eternal-september.org!news.eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: anton.txt@gmail.moc (Anton Shepelev)
Newsgroups: alt.usage.english,sci.stat.math
Subject: Re: statistics in Roberts. Was: RAW vs. raw image format
Date: Fri, 3 Mar 2023 23:33:25 +0300
Organization: A noiseless patient Spider
Lines: 135
Message-ID: <20230303233325.90ebe4980d7923974956b9bd@gmail.moc>
References: <f5a15ad4-4faf-440a-a59f-c5890d395961n@googlegroups.com>
<20230219220058.8d3d14741e18cce1bf19e256@gmail.com>
<51151e80-a719-46ef-8095-6535309e7d02n@googlegroups.com>
<20230220003936.ca90df6f8848a095271a0cbe@gmail.com>
<m35ybw2609.fsf@leonis4.robolove.meer.net>
<tt3eil$183th$2@dont-email.me>
<tt5fue$1iapr$1@dont-email.me>
<20230223193132.41882edd1d9110b60e745dac@gmail.moc>
<d7ufvhh40n67k40iqim6ikhnuil7luoavb@4ax.com>
<20230225001353.60271597ed5a42bec16e8d54@gmail.moc>
<0u3qvhlnu50kk3kg7e7jn6ujnene2fo8jk@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: reader01.eternal-september.org; posting-host="27b1624f0bafd55c5cf30a253eca8e7a";
logging-data="775412"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19d/POYlN6ztsqwx8wZ8gYhCFGMZvU5QN0="
Cancel-Lock: sha1:x4axBY55PWSgO/bVHR3lNWOG17s=
X-Newsreader: Sylpheed 3.7.0 (GTK+ 2.24.30; i686-pc-mingw32)

Rich Ulrich:

> I've cross-posted to a .stat group that has a few readers
> left.

Sad to hear that. Usenet should be taught in school as one
of the last heterarchical, accessible, and independent
communication media.

> I read the citation, and I'm not very interested. - I know
> too little about the device, etc., or about the ongoing
> arguments that apparently exist.

This little knowledge has its advantages -- you could verify
the model for internal consistency and then comment whether
it can be a/the right model for /any/ imaginary experiment.
But since you are not interested -- good luck with whatever
occupations fill you with enthusiasm, and feel free to skip
my comments below:

> Modern statistical analyses and design sophistication for
> statistics were barely being born in 1933, when the Miller
> experiment was published. In regards to complications and
> pitfalls, Time series is worse than analysis of
> independent points; and what I think of as 'circular
> series' (0-360 degrees) is worse than time series. I once
> had a passing acquaintance with time series (no data
> experience) but I've never touched circular data.

Furthermore, there are no time readings in Miller's data.
Although he tried to rotate the device at a steady rate,
irregularities were unavoidable. But mark you that Miller's
original analysis is largely of independent points, so that
whatever linear correction he might have applied could not
have affected the harmonic dependence of the fringe shift
upon device orientation.

> Also, 'messy data' (with big sources of random error)
> remains a problem with solutions that are mainly ad-hoc
> (such as, when Roberts offers analyses that drop large
> fractions of the data).

Yes. Furthermore, Roberts picked 67 of about 300 data sheets
from different experiments performed at different
locations and dates, instead of the entire data from one or
two of the best ones from Mt. Wilson, with the most
prominent positive results. I forget how Roberts acquired
those sheets. If he had to type them into the computer
manually, this incompleteness may be excused. But knowing
the importance of this seminal experiment and of his new
analysis, he really should have found the time, resources,
and help to digitise the entire data. Yet he has not put
online even the partial data he has.

> Roberts shows me that these data are so messy that it is
> hard to imagine Miller retrieving a tiny signal from the
> noise, if Miller did nothing more than remove linear
> trends from each cycle.

Does he show or tell? Do you comment on the graphs of
Miller data /after/ processing by his statistical model? It
is the model that I should like to understand better.

> I would want to know how the DEVICE made all those errors
> possible, as a clue to how to exclude their influence on
> an analysis.

This is an entirely different task -- an analysis of your
own -- perhaps more interesting and productive, but
impossible without Miller's original data. The device was a
large, supersensitive rotatable interferometer with two
orthogonal arms. The hypothesis tested was that, if the
Earth moved through the aether, the speed of light was
orientation-dependent, so that a signal with a period of
half a turn (in orientation, not in time!) should be
detected.
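
To make "periodic in orientation" concrete, here is a minimal
sketch of how such a signal could be looked for in one turn of
readings. It is not Miller's or Roberts's procedure. The 16
equally spaced orientations, the amplitude, the phase, and the
noise level are all made-up assumptions, and second_harmonic is
a hypothetical helper. For equally spaced orientations over a
full turn (five or more of them), the projection below is the
least-squares fit of c + a*cos(2*theta) + b*sin(2*theta).

```python
import math
import random


def second_harmonic(readings):
    """Least-squares amplitude and phase of the cos(2*(theta - phase))
    term, for readings taken at equally spaced orientations over one
    full turn (needs at least 5 readings for the projection to be exact)."""
    n = len(readings)
    thetas = [2 * math.pi * k / n for k in range(n)]
    a = 2 / n * sum(y * math.cos(2 * t) for y, t in zip(readings, thetas))
    b = 2 / n * sum(y * math.sin(2 * t) for y, t in zip(readings, thetas))
    return math.hypot(a, b), math.atan2(b, a) / 2


# Synthetic turn: assumed signal plus Gaussian noise (all values invented).
random.seed(1)
N, AMP, PHASE, NOISE_SD = 16, 0.05, 0.3, 0.01
turn = [
    AMP * math.cos(2 * (2 * math.pi * k / N - PHASE)) + random.gauss(0, NOISE_SD)
    for k in range(N)
]

amp, phase = second_harmonic(turn)
print(f"estimated amplitude {amp:.3f}, phase {phase:.3f} rad")
```

Note that the fit uses only the orientation of each reading and
never its time, which is the point of the remark above.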

> If Miller's data has something, Miller didn't show it
> right.

Why do you think so?

> If you are wondering about how he fit his model, I can say
> a little bit. The usual fitting in clinical research (my
> area) is with least-squares multiple regression, which
> minimizes the squared residuals of a fit. The main
> alternative is Maximum Likelihood, which finds the maximum
> likelihood from a Likelihood equation.

Exactly, and I bet it is symbolic parametrised functions that
you fit, and that your models include the random error
(noise) with perhaps assumptions about its distribution. Not
so with Roberts's model, which is neither symbolic nor has
noise as an explicit term!
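
For reference, the textbook form of a model with an explicit
noise term can be written as below. The notation (f, theta,
sigma_i) is mine, chosen for illustration, and is not taken
from Roberts's paper. With independent Gaussian errors of known
variance, minimizing the weighted squared residuals and
maximizing the likelihood give the same parameters, because the
two criteria differ only by a constant:

```latex
y_i = f(x_i;\theta) + \varepsilon_i, \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2) \text{ independent}

-2\log L(\theta)
  = \sum_i \frac{\bigl(y_i - f(x_i;\theta)\bigr)^2}{\sigma_i^2}
  + \sum_i \log\bigl(2\pi\sigma_i^2\bigr)
```

The first sum is the chi-squared statistic, and the second does
not depend on theta.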

> That is evaluated by chi-squared
> ( chisquared= -2*log(likelihood) ).
> Roberts seems to be using some version of that, though I
> didn't yet figure out what he is fitting.

I have a conjecture, and will discuss it with whoever agrees
to help me. Counting my friend, a data scientist, we are
three people who find his explanation unclear.

> I thought it /was/ appropriate that he took the
> consecutive differences as the main unit of analysis,
> given how much noise there was in general. From what I
> understood of the apparatus, those are the numbers that
> are apt to be somewhat usable.

They /are/ usable in that they still contain the supposed
signal and less random noise (because of "multisampling").
But you will be surprised if you look at what that does to
the systematic error!

> Ending up with a chi-squared value of around 300 for
> around 300 d.f. is appropriate for showing a suitably
> fitted model -- the expected value of X2 by chance for
> large d.f. is the d.f. A value much larger indicates
> poor fit; much smaller indicates over-fit.
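
The rule of thumb quoted above can be checked numerically. A
chi-squared variable with k degrees of freedom has mean k and
variance 2k, so for 300 d.f. values within a few tens of 300
are unremarkable. The sketch below only simulates that
distribution from standard normal draws. The number of trials
and the seed are arbitrary choices, and nothing here touches
Miller's or Roberts's data.

```python
import random
import statistics

random.seed(0)
DF = 300       # degrees of freedom, as in the quoted figure
TRIALS = 2000  # arbitrary number of simulated experiments


def chi2_draw(df):
    """One chi-squared draw: the sum of df squared standard normals."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))


draws = [chi2_draw(DF) for _ in range(TRIALS)]
mean = statistics.fmean(draws)
sd = statistics.stdev(draws)

# Theory: mean = DF, standard deviation = sqrt(2 * DF), about 24.5 here.
print(f"simulated mean {mean:.1f}, standard deviation {sd:.1f}")
```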

OK. My complaint, however, is about the model that he
fitted, and the way he did it -- by enumerating the
combinations of the seven free parameters by sheer brute
force. Roberts jumped smack dab into the jaws of the curse
of dimensionality where I think nothing called for it! He
even had to "fold" the raw data in two -- to halve the
degrees of freedom. I wonder what he would say to applying
that technique to an experiment with 360 measurements per
cycle!
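
To put a number on the curse of dimensionality, here is a
rough sketch of exhaustive grid search. It is not Roberts's
code. The toy loss, the grid values, and the levels per
parameter are assumptions made up for illustration, and
brute_force_fit is a hypothetical helper. The point is only
that the number of combinations grows as levels**parameters.

```python
import itertools


def brute_force_fit(loss, grids):
    """Evaluate loss on every combination of grid values.

    Returns the best parameter tuple and the number of evaluations."""
    best, best_loss, count = None, float("inf"), 0
    for params in itertools.product(*grids):
        count += 1
        value = loss(params)
        if value < best_loss:
            best, best_loss = params, value
    return best, count


def toy_loss(p):
    """Toy two-parameter loss with its minimum at (2, -1)."""
    return (p[0] - 2) ** 2 + (p[1] + 1) ** 2


grid = [-3, -2, -1, 0, 1, 2, 3]
best, count = brute_force_fit(toy_loss, [grid, grid])
print("best", best, "after", count, "evaluations")

# The same search over seven free parameters, for a few grid resolutions.
for levels in (5, 10, 20):
    print(levels, "levels per parameter ->", levels ** 7, "combinations")
```

With two parameters and seven levels the search costs 49
evaluations. With seven parameters, even ten levels each
already means ten million.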

Thanks for your comments, Rich.

--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
