Question:
Can the data vectors for each player be used to create a classification
of ball players?
First the number of player-types has to be determined from the data
matrix.
Then, once that number is determined, the data vector for each player type
is calculated, and the proportionate contribution of each type to each
player is calculated.
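
To make the "proportionate contribution" step concrete, here is a minimal sketch for the simplest case of two player types. It is not the PVA software: it is an ordinary least-squares projection, the type profiles are made-up numbers, and the names `mixing_fraction`, `slugger` and `contact` are only for illustration. It also assumes the number of types is already known.

```python
# Sketch only: two hypothetical player types, not output from PVA.
def mixing_fraction(player, type_a, type_b):
    """Least-squares share of type_a in player,
    assuming player is roughly f*type_a + (1 - f)*type_b."""
    diff = [a - b for a, b in zip(type_a, type_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two type profiles are identical")
    num = sum((p - b) * d for p, b, d in zip(player, type_b, diff))
    # Clip so the result is a valid proportion between 0 and 1.
    return min(1.0, max(0.0, num / denom))

# Made-up per-at-bat rates: singles, doubles, triples, home runs.
slugger = [0.150, 0.050, 0.005, 0.070]
contact = [0.230, 0.040, 0.010, 0.010]

# Build a player who is 70% "slugger" and 30% "contact" by construction.
player = [0.7 * s + 0.3 * c for s, c in zip(slugger, contact)]

share = mixing_fraction(player, slugger, contact)
print(f"slugger share: {share:.2f}, contact share: {1 - share:.2f}")
```

With more than two types the same idea becomes a constrained least-squares problem (non-negative shares that sum to one), which this sketch does not attempt.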
In a prior run, for example, we determined that Ralph Kiner (!!!) was the
best hitter in postwar baseball, just a little bit better than Ted
Williams or Joe DiMaggio. Many of the now-notorious current hitters
suddenly become a "new" type of player at some point in their careers.
If we take batting stats alone, each player-type is represented by a
set of hitting statistics. Each player is represented by the
contribution of each type to his hitting stats. We can then graph his
performance over time. Another interesting finding in the last run was
the uniqueness of Willie Mays with respect to triples.
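
As a picture of what one "data vector" might look like, here is a minimal sketch that turns a season line of counts into per-at-bat rates. This is one possible preprocessing, not a description of what the software does. The field names (G, AB, H, 2B, 3B, HR) are assumed from common batting-table abbreviations, the sample row is made up, and `rate_vector` is a hypothetical helper. The hit rates sum to the batting average, and adding the non-hit remainder makes the whole vector sum to one.

```python
# Sketch only: field names are assumed and the sample row is made up.
def rate_vector(row, min_games=20):
    """Return per-at-bat rates for one player-season,
    or None if the season is filtered out."""
    if row["G"] < min_games or row["AB"] == 0:
        return None
    ab = row["AB"]
    # Singles are hits that are not doubles, triples or home runs.
    singles = row["H"] - row["2B"] - row["3B"] - row["HR"]
    rates = {
        "1B": singles / ab,
        "2B": row["2B"] / ab,
        "3B": row["3B"] / ab,
        "HR": row["HR"] / ab,
    }
    # Remainder: at-bats that did not end in a hit.
    rates["out"] = 1.0 - sum(rates.values())
    return rates

season = {"G": 150, "AB": 500, "H": 150, "2B": 30, "3B": 5, "HR": 25}
vec = rate_vector(season)
batting_average = vec["1B"] + vec["2B"] + vec["3B"] + vec["HR"]
print(f"batting average from rates: {batting_average:.3f}")

# A season with fewer than 20 games is dropped.
short_season = dict(season, G=10)
print(rate_vector(short_season))
```

Computing one such vector per season is what would let a player's mix of types be graphed over time.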
I helped design this software (I'm a kind of statistician) and use it
several times a week for more boring applications in chemistry and
environmental science.
Thought we would take another whack at baseball stats for fun.
Tangotiger wrote:
> What is it that you are trying to accomplish? What is
> the question that you are seeking an answer for?
>
> Tom
>
> --- Robert Ehrlich <bobehrlich@...> wrote:
>
> >
> > I am about to engage in an analysis of baseball
> > stats. The database
> > that I am using is Lahman-52. We will be analyzing
> > 1921 to 2004.
> >
> > Is this a decent choice for the data?
> >
> > In that DB there are columns for singles, doubles,
> > triples, etc. as well
> > as at-bats.
> >
> > The sum of such columns (including various ways to
> > be out) is not the
> > same as the number of at-bats. However the numbers
> > do not include
> > decimal points--so I assume that they represent
> > counts.
> >
> > Should I divide those columns by the at-bats to
> > generate a string of
> > numbers that will total to the batting average?
> > That is, a set of
> > columns that will sum to unity and a subset (hits,
> > etc.) that will sum
> > to the batting average?
> >
> > We are planning to eliminate all records where the
> > number of games was
> > less than 20.
> >
> > Depending on the complexity of the output, we may
> > eliminate pitchers
> > from the input so as not to waste a degree of
> > freedom.
> >
> > Lastly we would like to have the names of potential
> > collaborators in
> > interpreting and writing up this data.
> >
> > Our analytical procedure is something called
> > "Polytopic Vector Analysis"
> > (PVA) and we have already done a few trial runs
> > with interesting results.
> >
> > Bob Ehrlich
> >
> >
> >
>
> -----------------------------------------------
> THE BOOK -- Playing The Percentages In Baseball
> http://www.InsideTheBook.com
>
> -----------------------------------------------
>