If by classification you mean a profile or style of
player, that's a good project. I look forward to
seeing what you have. If you want some ideas as to
how to make the classification, you can do things
like:
(BB+K)/PA
3B/(2B+3B)
SB/(1B*.8+BB*.6)
They each represent something specific. In the 2006
Hardball Times Annual, there are profiles based on GB,
FB, LD, etc. tendencies. This data is now available
at Fangraphs.com, back to 2002.
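
As a rough illustration, the three ratios above could be computed along
these lines in Python. This is only a sketch: the field names, the
zero-denominator handling, and the sample stat line are assumptions made
for the example (the numbers are made up, not a real player), and the
label given to each ratio is just one reading of what it measures.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counting stats for one hitter (field names are hypothetical)."""
    pa: int       # plate appearances
    singles: int  # 1B
    doubles: int  # 2B
    triples: int  # 3B
    bb: int       # walks
    k: int        # strikeouts
    sb: int       # stolen bases


def safe_div(num: float, den: float) -> float:
    """Return num/den, or 0.0 when the denominator is zero (an assumption)."""
    return num / den if den else 0.0


def profile(line: BattingLine) -> dict:
    """Compute the three style ratios; the key names are illustrative labels."""
    return {
        # (BB+K)/PA: share of plate appearances ending in a walk or strikeout
        "walk_or_k_rate": safe_div(line.bb + line.k, line.pa),
        # 3B/(2B+3B): share of doubles-plus-triples that were triples
        "triple_share": safe_div(line.triples, line.doubles + line.triples),
        # SB/(1B*.8+BB*.6): steals per weighted time reaching first base
        "steal_rate": safe_div(line.sb, 0.8 * line.singles + 0.6 * line.bb),
    }


# Made-up stat line, for illustration only.
example = BattingLine(pa=600, singles=100, doubles=30, triples=10,
                      bb=50, k=100, sb=22)
ratios = profile(example)
```

Each ratio is a rate rather than a count, so a part-time player and a
full-time player can be compared on style rather than playing time.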
However, the "proportionate" contribution is a long,
tired, and well-traveled road. I suggest reading what
is out there rather than reinventing the wheel. I'd
start (and stop) with Linear Weights.
Tom
--- Robert Ehrlich <bobehrlich@...>
wrote:
> Question:
>
> Can the data vectors for each player be used to
> create a classification
> of ball players?
>
> First the number of player-types has to be
> determined from the data
> matrix.
>
> Then once determined, the data vector for each
> player type is calculated
> and the proportionate contribution of each type to
> each player is
> calculated.
>
> In a prior run for example we determined that Ralph
> Kiner (!!!) was the
> best hitter in post war baseball just a little bit
> better than Ted
> Williams or Joe DiMaggio. Many of the now-notorious
> current hitters
> suddenly become a "new" type of player at some point
> in their careers.
>
> If we just take batting stats alone each player-type
> is represented by a
> set of hitting statistics. Each player is
> represented by the
> contribution of each type to his hitting stats. We
> can then graph his
> performance over time. Another interesting finding
> in the last run was
> the uniqueness of Willie Mays with respect to
> triples.
>
> I helped design this software (I'm a kind of
> statistician) and use it
> several times a week for more boring applications in
> chemistry and
> environmental science.
> Thought we would take another whack at baseball stats
> for fun.
>
> Tangotiger wrote:
>
> > What is it that you are trying to accomplish? What is
> > the question that you are seeking an answer for?
> >
> > Tom
> >
> > --- Robert Ehrlich <bobehrlich@...>
> > wrote:
> >
> > >
> > > I am about to engage in an analysis of baseball
> > > stats. The database
> > > that I am using is Lahman-52. We will be analyzing
> > > 1921 to 2004.
> > >
> > > Is this a decent choice for the data?
> > >
> > > In that DB there are columns for singles, doubles,
> > > triples, etc., as well as at-bats.
> > >
> > > The sum of such columns (including various ways to
> > > be out) is not the same as the number of at-bats.
> > > However, the numbers do not include decimal points,
> > > so I assume that they represent counts.
> > >
> > > Should I divide those columns by the at-bats to
> > > generate a string of numbers that will total to the
> > > batting average? That is, a set of columns that will
> > > sum to unity and a subset (hits, etc.) that will sum
> > > to the batting average?
> > >
> > > We are planning to eliminate all records where the
> > > number of games was less than 20.
> > >
> > > Depending on the complexity of the output, we may
> > > eliminate pitchers from the input so as not to waste
> > > a degree of freedom.
> > >
> > > Lastly, we would like to have the names of potential
> > > collaborators in interpreting and writing up this data.
> > >
> > > Our analytical procedure is something called
> > > "Polytopic Vector Analysis" (PVA), and we have already
> > > done a few trial runs with interesting results.
> > >
> > > Bob Ehrlich
> > >
> > >
> > >
> >
> > -----------------------------------------------
> > THE BOOK -- Playing The Percentages In Baseball
> > http://www.InsideTheBook.com
> >
> > -----------------------------------------------
> >
> >
> >
>
>