Re: Access and BDB
Message #3166 of 3880
Re: [baseball-databank] Re: Big multivariate analysis--questions

Question:

Can the data vectors for each player be used to create a classification
of ball players?

First the number of player-types has to be determined from the data
matrix.

Once that number is determined, the data vector for each player type is
calculated, and then the proportionate contribution of each type to each
player is calculated.

In a prior run, for example, we determined that Ralph Kiner (!!!) was the
best hitter in postwar baseball, just a little bit better than Ted
Williams or Joe DiMaggio. Many of the now-notorious current hitters
suddenly become a "new" type of player at some point in their careers.

If we take batting stats alone, each player-type is represented by a
set of hitting statistics. Each player is represented by the
contribution of each type to his hitting stats. We can then graph his
performance over time. Another interesting finding in the last run was
the uniqueness of Willie Mays with respect to triples.
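
To make the idea of "the contribution of each type to his hitting stats" concrete, here is a minimal sketch for the simplest possible case: only two player types. It is not PVA itself, just a least-squares stand-in. The two type vectors (SLUGGER, CONTACT) and their rates are invented for illustration, as are the function names.

```python
def blend(p, type_a, type_b):
    """Stat line of a player who is a share p of type_a and 1 - p of type_b."""
    return [p * a + (1 - p) * b for a, b in zip(type_a, type_b)]


def mixing_fraction(player, type_a, type_b):
    """Least-squares share of type_a in player, clipped to the range 0..1.

    Models player ~ p * type_a + (1 - p) * type_b and solves for p.
    """
    diff = [a - b for a, b in zip(type_a, type_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two player types are identical")
    p = sum((x - b) * d for x, b, d in zip(player, type_b, diff)) / denom
    return min(1.0, max(0.0, p))


# Hypothetical per-at-bat rates: singles, doubles, triples, home runs.
SLUGGER = [0.140, 0.050, 0.005, 0.070]
CONTACT = [0.230, 0.040, 0.015, 0.010]

# A made-up player who is one quarter "slugger" and three quarters "contact".
player = blend(0.25, SLUGGER, CONTACT)
share = mixing_fraction(player, SLUGGER, CONTACT)
print(f"slugger share: {share:.2f}")
```

Running mixing_fraction on each season of a player's career would give the kind of share-over-time series that can be graphed. With more than two types the same fit needs a constrained solver, since the shares must stay non-negative and sum to one.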

I helped design this software (I'm a kind of statistician) and use it
several times a week for more boring applications in chemistry and
environmental science.
Thought we would take another whack at baseball stats for fun.

Tangotiger wrote:

> What is it that you are trying to accomplish? What is
> the question that you are seeking an answer for?
>
> Tom
>
> --- Robert Ehrlich <bobehrlich@...> wrote:
>
> >
> > I am about to engage in an analysis of baseball
> > stats. The database
> > that I am using is Lahman-52. I will be analyzing
> > 1921 to 2004.
> >
> > Is this a decent choice for the data?
> >
> > In that DB there are columns for singles, doubles,
> > triples, etc. as well
> > as at-bats.
> >
> > The sum of such columns (including various ways to
> > be out) is not the
> > same as the number of at-bats. However the numbers
> > do not include
> > decimal points--so I assume that they represent
> > counts.
> >
> > Should I divide those columns by the at-bats to
> > generate a string of
> > numbers that will total to the batting average?
> > That is, a set of
> > columns that will sum to unity and a subset (hits,
> > etc.) that will sum
> > to the batting average?
> >
> > We are planning to eliminate all records where the
> > number of games was
> > less than 20.
> >
> > Depending on the complexity of the output, we may
> > eliminate pitchers
> > from the input so as not to waste a degree of
> > freedom.
> >
> > Lastly we would like to have the names of potential
> > collaborators in
> > interpreting and writing up this data.
> >
> > Our analytical procedure is something called
> > "Polytopic Vector Analysis"
> > (PVA) and we have already done a few trial runs
> > with interesting results.
> >
> > Bob Ehrlich
> >
> >
> >
>
> -----------------------------------------------
> THE BOOK -- Playing The Percentages In Baseball
> http://www.InsideTheBook.com
>
> -----------------------------------------------
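
The question quoted above, about dividing the count columns by at-bats, can be sketched as follows. This assumes Lahman-style column names (G, AB, H, 2B, 3B, HR, yearID) and derives singles as hits minus extra-base hits, which is an assumption about the table layout. The sample row is invented, and the function names are hypothetical.

```python
def keep(row, min_games=20, first_year=1921, last_year=2004):
    """Apply the filters from the quoted message: 20+ games, 1921 to 2004."""
    return row["G"] >= min_games and first_year <= row["yearID"] <= last_year


def at_bat_shares(row):
    """Split at-bats into outcome shares that sum to 1.

    The four hit shares (1B, 2B, 3B, HR) sum to the batting average.
    NOHIT is every at-bat that did not end in a hit.
    """
    ab = row["AB"]
    if ab <= 0:
        raise ValueError("need at least one at-bat")
    counts = {
        "1B": row["H"] - row["2B"] - row["3B"] - row["HR"],  # derived singles
        "2B": row["2B"],
        "3B": row["3B"],
        "HR": row["HR"],
        "NOHIT": ab - row["H"],
    }
    return {name: count / ab for name, count in counts.items()}


# Invented season line, for illustration only.
season = {"yearID": 1950, "G": 140, "AB": 500, "H": 150,
          "2B": 30, "3B": 5, "HR": 20}

if keep(season):
    shares = at_bat_shares(season)
    batting_average = sum(shares[k] for k in ("1B", "2B", "3B", "HR"))
    print(shares, batting_average)
```

Walks, hit-by-pitch, and sacrifices are not at-bats, so they fall outside these shares. Including them would mean dividing by plate appearances instead.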




Wed Aug 16, 2006 2:40 am

bobehrlichpva

In response to both responses to my response... I completely agree that Access is blown out of the water - as is Paradox, FileMaker, and every other consumer...
P Mondout
pmondout
May 18, 2006
3:30 pm

... Much of what you are asking for here is available in the book "Baseball Hacks" by Joseph Adler, which has already been mentioned. I don't think it ...
John Walsh
walshj58
May 18, 2006
3:54 pm

... I am presently writing an article that shows how to get the BDB information into PostgreSQL and load the Retrosheet data into MySQL and PostgreSQL. This...
Mat Kovach
matkovach
May 18, 2006
5:14 pm

... I think another option is SQLite[1], which is very fast and perfect for small queries. I did the conversion a couple months back, and have an SQLite...
Ben Matasar
benmatasar
May 18, 2006
5:40 pm

Someone else mentioned using Access as a front-end. This is absolutely true, and yet another great usability feature of Access. It's a snap to make an...
tangotiger
May 19, 2006
1:04 am

... If you write 100 queries (and save them), how do you keep them straight? ... Is this like "working" with oracle all the time but "using" a mailreader most...
Paul Wendt
pgw02472
May 20, 2006
12:23 am

... Ahhh... Access has a "properties" button, where you can put comments and documentation for each view, which you can then see. That column is also...
tangotiger
May 20, 2006
1:46 pm

... I considered that once long ago; maybe I should consider again. Now I have Query names as two "dimensions" of organization because they are both...
Paul Wendt
pgw02472
May 24, 2006
3:32 pm

... Just to give you some ideas. When I was working on my project, I tackled many topics and subtopics. So, when I worked on Relievers and Leverage, I have a...
tangotiger
May 25, 2006
12:50 am

I am about to engage in an analysis of baseball stats. The database that I am using is Lahman-52. will be analyzing 1921 to 2004. Is this a decent choice for...
Robert Ehrlich
bobehrlichpva
Aug 14, 2006
2:57 pm

What is it that you are trying to accomplish? What is the question that you are seeking an answer for? Tom ... THE BOOK -- Playing The Percentages In Baseball...
tangotiger
Aug 16, 2006
2:24 am

Question: Can the data vectors for each player be used to create a classification of ball players? First the number of player-types has to be determined from...
Robert Ehrlich
bobehrlichpva
Aug 16, 2006
2:43 am

If by classification you mean a profile or style of player, that's a good project. I look forward to seeing what you have. If you want some ideas as to how...
tangotiger
Aug 16, 2006
3:01 am

... Of course. ... As Tom Tango suggested, that reveals mixed motives. I believe that it must get in the way of classifying players --which you might do by ...
Paul Wendt
pgw02472
Aug 16, 2006
6:01 pm

... I guess that you are inclined to denominate by at bats, ignoring bases on balls and hits by pitch, sacrifice bunts and flies, because popular baseball uses...
Paul Wendt
pgw02472
Aug 16, 2006
6:09 pm

Paul: Thanks for the advice--I will include the other stats. The hitting example was a feeble attempt to describe what I am doing. The procedure is robust in...
Robert Ehrlich
bobehrlichpva
Aug 16, 2006
10:07 pm

Kristin Campbell <kcampbell53@...> is a colleague, I suppose? ... The particular article by Jim Albert, advocating four rates understood sequentially,...
Paul Wendt
pgw02472
Aug 17, 2006
12:29 am

... Voros first described this process when he developed DIPS. He would break up the stat line into binary components: HBP, no HBP. Of the no HBP, walk or no ...
tangotiger
Aug 16, 2006
10:33 pm

... Based on what I know of the project, I might have guessed 1939 because in the Batting table I find 26106 records with null GIDP (ground into double play)...
Paul Wendt
pgw02472
Aug 17, 2006
1:15 am

I'm an Oracle programmer analyst/DBA. If you'd like to discuss this further feel free to contact me directly. ... From: P Mondout To:...
Dereck L. Dietz
dldietz2001
May 19, 2006
12:32 am

How was Oracle making it complicated to move data from one Oracle database to another? ... From: Tangotiger To: baseball-databank@yahoogroups.com Sent:...
Dereck L. Dietz
dldietz2001
May 19, 2006
4:01 pm