What are you trying to accomplish? What question are you
trying to answer?
Tom
--- Robert Ehrlich <bobehrlich@...> wrote:
>
> I am about to engage in an analysis of baseball stats. The database
> that I am using is Lahman-52. We will be analyzing 1921 to 2004.
>
> Is this a decent choice for the data?
>
> In that DB there are columns for singles, doubles, triples, etc. as well
> as at-bats.
>
> The sum of such columns (including various ways to be out) is not the
> same as the number of at-bats. However, the numbers do not include
> decimal points--so I assume that they represent counts.
>
> Should I divide those columns by the at-bats to generate a string of
> numbers that will total to the batting average? That is, a set of
> columns that will sum to unity and a subset (hits, etc.) that will sum
> to the batting average?
>
> We are planning to eliminate all records where the number of games was
> less than 20.
>
> Depending on the complexity of the output, we may eliminate pitchers
> from the input so as not to waste a degree of freedom.
>
> Lastly, we would like to have the names of potential collaborators in
> interpreting and writing up this data.
>
> Our analytical procedure is something called "Polytopic Vector Analysis"
> (PVA) and we have already done a few trial runs with interesting results.
>
> Bob Ehrlich
>
>
>
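
On the divide-by-at-bats question, here is a minimal Python sketch of one
way to build the shares, not a tested recipe for the Lahman tables. The
field names (games, at_bats, singles, and so on) and the sample season
line are made up for illustration, and the sketch assumes every at-bat
that is not a hit counts as an out. Walks, hit-by-pitches, and sacrifices
are not at-bats, so they stay out of the denominator, which may be one
reason the raw columns do not add up to at-bats.

```python
# Hypothetical field names; these are illustrative assumptions,
# not the actual Lahman column names.
HIT_FIELDS = ("singles", "doubles", "triples", "home_runs")


def at_bat_shares(row):
    """Split one season line into outcome shares of at-bats that sum to 1."""
    ab = row["at_bats"]
    if ab <= 0:
        raise ValueError("need at least one at-bat")
    shares = {field: row[field] / ab for field in HIT_FIELDS}
    # Assumption: every at-bat that is not a hit is treated as an out.
    hits = sum(row[field] for field in HIT_FIELDS)
    shares["outs"] = (ab - hits) / ab
    return shares


def keep_regulars(rows, min_games=20):
    """Drop season lines with fewer than min_games games played."""
    return [row for row in rows if row["games"] >= min_games]


# Made-up season lines, for illustration only.
seasons = [
    {"games": 150, "at_bats": 500, "singles": 100, "doubles": 30,
     "triples": 5, "home_runs": 15},
    {"games": 12, "at_bats": 20, "singles": 3, "doubles": 1,
     "triples": 0, "home_runs": 0},
]

regulars = keep_regulars(seasons)
shares = at_bat_shares(regulars[0])
# The hit shares alone add up to the batting average.
batting_average = sum(shares[field] for field in HIT_FIELDS)
print(round(sum(shares.values()), 6), round(batting_average, 3))
```

The hit shares sum to the batting average and all shares together sum to
one, which is the "sum to unity" layout asked about above.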
-----------------------------------------------
THE BOOK -- Playing The Percentages In Baseball
http://www.InsideTheBook.com
-----------------------------------------------