baseball-databank · Baseball Databank

Messages 3902 - 3931 of 4385
#3902 From: "Tangotiger" <tom@...>
Date: Mon Feb 1, 2010 8:55 pm
Subject: Errors in XREF_Stats table
tom@...
 
Line 1: Player MLBAM BDB Retro
Line 2: correct STATS, but has incorrect in Xref_Stats

Chris Smith 434672 smithch07 smitc002
8253 not 6378

Jacque Jones 150218 jonesja04 jonej003
6246 not 7191

Tom


---------------------------------------------
The Book--Playing The Percentages In Baseball
http://www.InsideTheBook.com

#3903 From: "Tangotiger" <tom@...>
Date: Fri Feb 5, 2010 8:55 pm
Subject: Re: Errors in XREF_Stats table
tom@...
 
In conjunction with the earlier mention,  jonesja05 (Jason Dewey Jones)
should have a STATS Id of 7191 not 6246.  Basically, the two were swapped.

Tom
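
The corrections reported in this thread could be applied along these lines. This is only a sketch: the table layout (`xref_stats` with `bdb_id` and `stats_id` columns) is an assumption made for illustration, not the actual Xref_Stats schema, and SQLite stands in for whatever database you use.

```python
import sqlite3

# Hypothetical layout: one row per player, keyed by BDB ID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xref_stats (bdb_id TEXT PRIMARY KEY, stats_id INTEGER)")
conn.executemany(
    "INSERT INTO xref_stats VALUES (?, ?)",
    [("smithch07", 6378), ("jonesja04", 7191), ("jonesja05", 6246)],
)

# Corrections reported in this thread: BDB ID -> correct STATS ID.
corrections = {"smithch07": 8253, "jonesja04": 6246, "jonesja05": 7191}

with conn:  # single transaction, so the swapped pair is never half-applied
    conn.executemany(
        "UPDATE xref_stats SET stats_id = ? WHERE bdb_id = ?",
        [(stats_id, bdb_id) for bdb_id, stats_id in corrections.items()],
    )

fixed = dict(conn.execute("SELECT bdb_id, stats_id FROM xref_stats"))
print(fixed)
```

Keying each update on the BDB ID rather than on the old STATS ID avoids the two Jones rows overwriting each other mid-swap.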





---------------------------------------------
The Book--Playing The Percentages In Baseball
http://www.InsideTheBook.com

#3904 From: "Tangotiger" <tom@...>
Date: Fri Feb 5, 2010 9:59 pm
Subject: The Universal Player ID and Biographical Data project
tom@...
 
I have found a willing partner in MLBAM to provide all their IDs.  I have
taken the first step of cobbling ID maps of various people and sources,
and have created the following:

http://www.insidethebook.com/ee/images/uploads/EXPORT_ID_MAP.zip

More details are here:
http://www.insidethebook.com/ee/index.php/site/comments/the_universal_player_id_and_biographical_data_project/

For all of you who have an ID mapping file, PLEASE, download the above,
link it to yours and:
1. fix your files if you find errors
2. tell me about any errors I have
3. after you fix your errors, submit your ID mapping file to me; I can
certify it as accurate, or it might help me in finding more errors

I have so far cobbled data from several sources.  My hope is that after 2
or 3 iterations of this that I will come up with the definitive ID mapping
file.
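
Linking your own mapping file to the downloaded one and listing disagreements might look roughly like this. The column names (`bdb_id`, `mlbam_id`) and the inline CSV text are assumptions for illustration, since the layout of the export file is not described here; the ID 999999 is a made-up wrong value.

```python
import csv
import io

def load_map(text, key, value):
    """Read CSV text into {key_id: value_id}, skipping rows with a blank value."""
    rows = csv.DictReader(io.StringIO(text))
    return {r[key]: r[value] for r in rows if r[value]}

# Tiny stand-ins for "my" file and the downloaded master file.
mine = "bdb_id,mlbam_id\nsmithch07,434672\njonesja04,999999\n"
master = "bdb_id,mlbam_id\nsmithch07,434672\njonesja04,150218\n"

a = load_map(mine, "bdb_id", "mlbam_id")
b = load_map(master, "bdb_id", "mlbam_id")

# Players present in both files whose MLBAM IDs disagree.
conflicts = {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}
print(conflicts)
```

A disagreement only says that one of the two files is wrong, not which one, so each conflict still needs a manual check.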

If someone out there works for STATS, then be a good guy and help me out.
The BIS data looks like it's complete, as is the Retro and BDB.  There are
still some problems with the MLBAM data (a few dozen corrections
required).

Once this is done, we can then link to the biographical information (name,
handedness, birth, death, school), and MLBAM as I said is a willing
partner here.  MLBAM includes not only the 17,000 or so MLB player IDs, but
another 65,000 or so minors and other player IDs.

Hopefully, at some point in the near future, we can make it so that we
have overnight updates any time corrections are made.

Help me, help you.

Thanks, Tom

---------------------------------------------
The Book--Playing The Percentages In Baseball
http://www.InsideTheBook.com

#3905 From: "Tangotiger" <tom@...>
Date: Fri Feb 5, 2010 11:17 pm
Subject: re: The Universal Player ID and Biographical Data project
tom@...
 
I forgot to note that these are showing duplicates based on conflicting
data sources and I haven’t tried to resolve them yet:

bis_id mlbam1 mlbam2
18 297292 491703
49 112004 452035
263 110638 445216
936 233594 446248
1137 237796 425532
1479 400123 420944

stats_id mlbam1 mlbam2
6918 400123 420944

Tom
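
Duplicates of this kind can be found by pooling (source ID, MLBAM ID) pairs from every source and keeping the keys that map to more than one MLBAM ID. A minimal sketch: the first two keys reuse rows from the list above, and the third is a made-up clean entry added for contrast.

```python
from collections import defaultdict

# (bis_id, mlbam_id) pairs pooled from several sources.
pairs = [
    (18, 297292), (18, 491703),
    (49, 112004), (49, 452035),
    (7, 123456), (7, 123456),   # hypothetical player on whom all sources agree
]

seen = defaultdict(set)
for bis_id, mlbam_id in pairs:
    seen[bis_id].add(mlbam_id)

# Keep only the keys with conflicting MLBAM IDs.
duplicates = {k: sorted(v) for k, v in seen.items() if len(v) > 1}
print(duplicates)
```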

#3906 From: Wells Oliver <wells@...>
Date: Thu Feb 4, 2010 9:55 pm
Subject: Extra base hits for pitchers?
xmutex
 
Am I missing something or is this not included in the Pitching table, nor derivable- there's no doubles nor triples.

--
Wells Oliver
wells@...

#3907 From: "Clem Comly" <ccomly@...>
Date: Tue Feb 9, 2010 4:30 am
Subject: Re: Extra base hits for pitchers?
ccomly2003
 
They are not official statistics.  They are available from Retrosheet for seasons that have 100% play-by-play and for recent seasons where p-b-p is available.
 
Clem Comly


#3908 From: "mitchell m" <mmc_cann@...>
Date: Tue Feb 9, 2010 5:19 pm
Subject: NULLs vs Zeros again, boring, boring
mmc_cann
 
Paul DuBois, "MySQL Cookbook," recipe 1.2 creates a database of limbs.  Humans
have 2 arms and 2 legs.  Insects get 6 legs and 0 arms.  A phonograph gets 1 arm
and 0 legs.  The only "thing" that gets NULLs for both arms and legs is the
"Space Alien" because of its unknown physiology.
I think that Sack hits and flys were zero rather than NULL.  This opinion holds
for many other of the categories in the data bank tables.

#3909 From: Theodore Turocy <drarbiter@...>
Date: Tue Feb 9, 2010 5:58 pm
Subject: Re: NULLs vs Zeros again, boring, boring
arb1ter
 


On Tue, Feb 9, 2010 at 12:19 PM, mitchell m <mmc_cann@...> wrote:
 

Paul DuBois, "MySQL Cookbook," recipe 1.2 creates a database of limbs. Humans have 2 arms and 2 legs. Insects get 6 legs and 0 arms. A phonograph gets 1 arm and 0 legs. The only "thing" that gets NULLs for both arms and legs is the "Space Alien" because of its unknown physiology.
I think that Sack hits and flys were zero rather than NULL. This opinion holds for many other of the categories in the data bank tables.

There's a basic logical fallacy in this argument.

The example shows that something unknown should be represented as NULL.  It does not follow that anything represented as NULL must be unknown.

Furthermore, just because a category was not reported officially does not mean the events did not occur.  All RBI totals pre-1920 were reconstructed ex-post-facto.  There is a current research project which is reporting batter strikeouts for years in which they were not reported officially. These totals do (in the case of RBIs) and most likely will (in the case of strikeouts) appear in baseball-databank.  

The use of NULL as "we don't know (yet)" is clearly a best practice.  Full stop.

Ted

 


#3910 From: "Tangotiger" <tom@...>
Date: Tue Feb 9, 2010 6:28 pm
Subject: Re: NULLs vs Zeros again, boring, boring
tom@...
 
> Paul DuBois, "MySQL Cookbook," recipe 1.2 creates a database of limbs.
> Humans have 2 arms and 2 legs.  Insects get 6 legs and 0 arms.  A
> phonograph gets 1 arm  and 0 legs.  The only "thing" that gets NULLs for
> both arms and legs is the "Space Alien" because of its unknown physiology.
> I think that Sack hits and flys were zero rather than NULL.  This opinion
> holds for many other of the categories in the data bank tables.
>
>

Ted is correct in his other response.  We want to classify things as:
a - known to not have occurred
b - unknown to have occurred
c - impossible to have occurred

So for a), we represent this as "0".  For c), we represent this as "null".
  There is 100% agreement here.

The other item is where the issue occurs.  If you make b) "0", then how do
you distinguish it from the legitimate "0" in a)?  You can't.  If you make
b) "null", then how do you know if it could possibly exist or not?

Say, I dunno... batter-balks used to exist, and now they don't, and you
have a category called batter-balks.  You set it to null for all the
present-day players, and, what... 0 for the other guys who played in the
batter-balk era but have unknown totals?  No, you make them null.  But,
then how do you distinguish the batter-balk era from the other era?  You
can't from this particular data, but that's why you have some other table
that defines the season range for the data.

So, you can have a complete description of the data by only using 0 when
you know the total actually was 0.

Tom
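
The three cases can be told apart with a lookup table of season ranges, roughly as below. Everything here is a sketch under assumptions: the `batting` and `stat_range` tables, their columns, and the single 1954-onward range for SF are illustrative only, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE batting (player TEXT, year INTEGER, sf INTEGER);
-- Hypothetical lookup table: seasons in which each stat could occur at all.
CREATE TABLE stat_range (stat TEXT, first_year INTEGER, last_year INTEGER);
INSERT INTO stat_range VALUES ('sf', 1954, 9999);
INSERT INTO batting VALUES ('a', 1950, NULL);  -- c) impossible in that season
INSERT INTO batting VALUES ('b', 1960, NULL);  -- b) possible, total unknown
INSERT INTO batting VALUES ('c', 1960, 0);     -- a) known to be zero
""")

rows = conn.execute("""
SELECT b.player,
       CASE
         WHEN b.sf IS NOT NULL THEN 'known'
         WHEN b.year BETWEEN r.first_year AND r.last_year THEN 'unknown'
         ELSE 'impossible'
       END AS status
FROM batting b JOIN stat_range r ON r.stat = 'sf'
ORDER BY b.player
""").fetchall()
print(rows)
```

A stat that came in and out of existence would simply need more than one row in the range table.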




---------------------------------------------
The Book--Playing The Percentages In Baseball
http://www.InsideTheBook.com

#3911 From: "mitchell m" <mmc_cann@...>
Date: Wed Feb 10, 2010 7:43 pm
Subject: Defaulting NULL vs Zero won't this ever go away?
mmc_cann
 
Tango Tom and other interested baseball data bank users,  Wednesday,

You really believe that the present NULL defaulting plan is justified and
most correct. I have given up all hope of convincing you that anything else
should be considered.

Perhaps you will consider some other method of making the historical record
consistent.  To that end, I have uploaded, (to the files section of this group,)
a file named Tom.txt.  That file records the results of the same query about
SF's, (sacrifice flys,) to BOTH the present NULL defaulted database "bbdb2," AND
a zero defaulted clone of that same information, "bbdb1."

If you study the results you can see some of the early years have zero SF, (sack
Flys) and some have NULLs.  Shouldn't your strongly held argument apply
consistently to all data displayed for these years?

Notice how years since 1954 are identical.  Indeed after studying many trends
spanning these years using both methods of defaulting, I see major advantage in
defaulting to zero.  Perhaps I have not seen a disadvantage only because of the
things I study.  So far my observation remains, defaulting to zero produces
better trend analysis.  But there I go again trying to win you over!

I know that this topic smells of Formaldehyde to most Data-bank users.  But
since the data-bank is largish and is comprehensive, it is useful for students
of certain arcane statistical processes.  My hope is to make useful analysis of
long term trends.  So I have and will make available either the code to modify
the current database, or a derived database for those who might request it.

Bueno Venturs, we have come to a parting of the ways.
Cactus Mitch,

#3912 From: Sean Forman <sean-forman@...>
Date: Wed Feb 10, 2010 7:52 pm
Subject: Re: Defaulting NULL vs Zero won't this ever go away?
sforman71
 
Mitchell,

Replacing nulls with zero on your end is trivial.  Using Excel, do a find-and-replace.  Using Linux, write a Perl or sed script to replace.  Using a database program, use the ifnull or coalesce commands.  If you are using the data for trend analysis, surely your software is sophisticated enough to handle missing data points?

sean
---
Sean Forman
Sports Reference LLC, President
http://www.sports-reference.com/
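
As a small illustration of the database route, COALESCE can turn NULL into 0 at query time while the stored data keeps the distinction. The `batting` table and its columns are assumptions for this sketch, and SQLite stands in for MySQL, where `IFNULL(sf, 0)` behaves the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE batting (player TEXT, sf INTEGER)")
conn.executemany(
    "INSERT INTO batting VALUES (?, ?)",
    [("a", 3), ("b", None), ("c", 0)],
)

# COALESCE returns its first non-NULL argument, so NULL is shown as 0
# without changing what is stored in the table.
rows = conn.execute(
    "SELECT player, COALESCE(sf, 0) FROM batting ORDER BY player"
).fetchall()
print(rows)

# The stored value for player 'b' is still NULL.
stored = conn.execute("SELECT sf FROM batting WHERE player = 'b'").fetchone()[0]
```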





#3913 From: Tangotiger <tangotiger@...>
Date: Wed Feb 10, 2010 9:26 pm
Subject: Re: Defaulting NULL vs Zero won't this ever go away?
tangotiger
 
IIRC, SF came in and out of existence.  If the BDB is showing 0 where it should be null based on the definition, then that should be corrected.

A database is supposed to simply represent data.  "0" can only mean one thing: that the number of times an event occurred was zero.  "null" can mean more than 1 thing.  0 cannot be treated differently from 1, 2, or 300.

Tom
 
---------------------------------------------
Tim Raines, Hall of Fame 2008
http://www.raines30.com/

#3914 From: Tangotiger <tangotiger@...>
Date: Wed Feb 10, 2010 9:29 pm
Subject: Re: Defaulting NULL vs Zero won't this ever go away?
tangotiger
 
By the way, you COULD use "-1" to represent one of the two conditions I noted in my first email, with null taking the other.  This though is going to be extra effort on the end user in terms of making sure to ignore the value.

Tom
 
---------------------------------------------
Tim Raines, Hall of Fame 2008
http://www.raines30.com/

#3915 From: Rod Nelson <rodericnelson@...>
Date: Wed Feb 10, 2010 10:54 pm
Subject: Annual Book of Record
rockymtnsabr
 
If the concept of baseball data in book form appeals to you, you'll want to download The Emerald Guide to Baseball - 2010 in PDF format from http://www.sabr.org

Comprehensive batting/pitching/fielding data for 2009 MLB and MiLB and a whole lot more..  Edited by Gillette, Palmer, Turocy and myself.

Thanks~

--
Rod Nelson, Managing Editor
The Emerald Guide to Baseball 2010
http://www.emeraldsportsguides.com

Download it Free!  http://www.sabr.org
Buy it in Hardcopy http://www.lulu.com/sabr

#3916 From: "Alberto Perdomo" <albertop@...>
Date: Thu Feb 11, 2010 1:24 pm
Subject: Re: Annual Book of Record
albertop69
 
Rod, 
 
Thanks for this wonderful guide!
 
There is a minor error in the table of contents (page vi) ...  Florida State League is shown as Class A instead of Advanced Class A.
 
Regards,
 
Alberto.



#3917 From: "mitchell m" <mmc_cann@...>
Date: Sun Feb 14, 2010 6:38 am
Subject: end user ease, unbiased conclusions....
mmc_cann
 
Sean, Tom,

Yes it is easy to change NULLs to zeds.  I've got lots of practice.  Then I
decided that it would be easier to alter the defaults.  Then I noticed that in
some cases the groups selected were different.  MySQL handles NULLs and ZEDs
differently when selecting or displaying results based on computations.
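
The difference shows up in aggregates and filters: standard SQL aggregates skip NULLs, and a comparison such as `sf = 0` never matches a NULL row. A minimal sketch, using SQLite as a stand-in for MySQL (which follows the same rules here) and a made-up one-column table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (sf INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(4,), (None,), (0,)])

# AVG and COUNT(column) ignore NULL rows; COUNT(*) counts every row.
avg_sf, n_rows, n_known = conn.execute(
    "SELECT AVG(sf), COUNT(*), COUNT(sf) FROM t"
).fetchone()

# Treating NULL as zero pulls the average down.
avg_zeroed = conn.execute("SELECT AVG(COALESCE(sf, 0)) FROM t").fetchone()[0]

# A filter on zero selects only the stored zero, not the NULL row.
zero_rows = conn.execute("SELECT COUNT(*) FROM t WHERE sf = 0").fetchone()[0]

print(avg_sf, n_rows, n_known, avg_zeroed, zero_rows)
```

So a zero-defaulted clone and a NULL-defaulted database can return different groups and different averages from the same query.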

While trying to recreate Gould's conclusion that baseball is getting
better and that the demise of the .400 hitter is a result of that improvement, I
noticed a quantity effect.  Increasing the number of at bats decreases the
probability of maintaining a high batting average.

Gould had a computer and a programmer available, though he did much of the
computational work in his hospital bed.  He commented on baseball statistics'
medicinal value.  Nevertheless he used a high low-end cut-off of about 100 at
bats, (if I remember correctly.)  That seemed worth study to me.  Early on there
were many fewer games and therefore fewer at bats.   In a drug effectiveness
study one should not de-select subjects as inappropriate if they show adverse
reaction tendencies and just say they were not in the experimental group.

Why is the post-season batting table Home Run column defaulted to non-NULL and
zero in the present offering?  The way I suggest is in the data base already,
albeit in only one column that I have found so far.

Some might point out that we are dealing only with historical records.  
Devotion to baseball stats is not for everyone.

Catchers and pitchers report soon. cactusmitch

#3918 From: Bryan Walko <bryanwalko@...>
Date: Sun Feb 14, 2010 3:51 am
Subject: KBO Database
bryanwalko
 
For anyone who's an international baseball geek, I noticed that outside of the US, the only decent historical stats were for the Japanese League. The Korean Baseball Organization (KBO) has been around for nearly 30 years now, and has been incredibly stable (having only one defunct franchise in 30 years).

It rates as probably the third best league in the world (MLB/NPB/KBO), although the Mexican League is pretty decent as well (and is technically part of the NAPBL). Basically KBO is the last league that you have in the world before you get to the more fly-by-night and/or lesser leagues (China, Netherlands, Italy, Cuba, Taiwan, Australia).

To the point, I found some good stores of data, and I started creating a KBO historical database. I completed my first one (1982-2008) last year and updated it this year for the 2009 season, and I plan on continuing to maintain it. You can see it here, if you have any interest. I asked Michael Westbay of JapaneseBaseball.com to host it, since I figured it's probably the best English language hangout for Asian baseball fans: http://japanesebaseball.com/data/index.gsp

#3919 From: Michael Westbay <westbaystars@...>
Date: Tue Feb 16, 2010 12:39 am
Subject: Re: KBO Database
westbaystars
 
Walko-san wrote:

> I asked Michael Westbay of JapanBaseball.com to host it, since I figured it's
> probably the best English language hangout for Asian baseball fans:
> http://japanesebaseball.com/data/index.gsp

The original data set is at the above location.  The links to the 2009
data can be found here:

http://www.japanesebaseball.com/forum/thread.gsp?forum=3&thread=30250#377608

Hope this helps.

--
Michael Westbay
Writer/System Administrator
http://www.japanesebaseball.com/

#3920 From: "Tangotiger" <tom@...>
Date: Tue Feb 16, 2010 4:09 am
Subject: Re: end user ease, unbiased conclusions....
tom@...
 
> Why is the post-season batting table Home Run column defaulted to non-NULL
> and zero in the present offering?  The way I suggest is in the data base
> already, albeit in only one column that I have found so far.

If you followed my rules, then you would have to conclude that there is
never an occasion in playoff baseball that someone hit a HR and we didn't
know about it.  Hence, nulls are impossible.

It seems to me that taking a stance that HR is non-nullable in playoff
history is perfectly fine.

Tom
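
Declaring a column non-nullable can be sketched as follows. The table and column names are hypothetical, not the actual post-season batting schema, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# hr can never be unknown, so it is NOT NULL with a default of 0;
# sf stays nullable because its total may be unknown.
conn.execute(
    "CREATE TABLE batting_post (player TEXT, hr INTEGER NOT NULL DEFAULT 0, sf INTEGER)"
)

# Omitting both columns gives hr its default and leaves sf as NULL.
conn.execute("INSERT INTO batting_post (player) VALUES ('a')")
row = conn.execute("SELECT hr, sf FROM batting_post WHERE player = 'a'").fetchone()

# An explicit NULL for hr is rejected by the constraint.
try:
    conn.execute("INSERT INTO batting_post (player, hr) VALUES ('b', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(row, rejected)
```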

#3921 From: "mitchell m" <mmc_cann@...>
Date: Tue Feb 16, 2010 4:19 am
Subject: See the graphs and database in other part of this group.
mmc_cann
 
I'm still learning how to use the upload features of other parts of this
group....
On the Open Office spreadsheet it is pretty easy to change the displayed number of
decimal places, but that didn't seem to come all the way through to the
Yahoo database file.  The graph, (in the Photos section, (I used R,)) kept the
more accurate display easily for the study using Zedded entries.

I still think that it is best to convert the NULLs to ZEDs in most cases.

At Bats per Plate Appearance, or Batting Opportunity as some call it, is
declining.  The current scoring rules advantage pitchers over hitters.  I like
watching low scoring pitching duels.  But is it "Better Baseball," and is it as
exciting for kids?  I don't even talk with my grand kids about such like.  My
grand daughter likes rugby!

There is no FAIR comparison of today's batters with those of old.  Sad but true.

#3922 From: "wydiyd" <wydiyd@...>
Date: Mon Feb 22, 2010 6:31 pm
Subject: Colleges
wydiyd
 
I just ran a query on all the colleges that players attended.  The numbers are
lower than the ones from B-ref.

In a couple of cases, the player attended two universities, but in most cases the
college is just missing.

Is there a reason for this discrepancy?

Gabe Molina is an example.

#3923 From: Wells Oliver <wells@...>
Date: Tue Feb 23, 2010 6:33 pm
Subject: MLBAM IDs- a question that's been asked, I know..
xmutex
 
Sorry, is there a definitive place to go (or an almost definitive place to go) for a csv/mysql export of MLBAM IDs mapped to retro/bbref IDs?

I'll make a bookmark this time, I swear. Thanks.

--
Wells Oliver
wells@...

#3924 From: "Tangotiger" <tom@...>
Date: Tue Feb 23, 2010 8:57 pm
Subject: Re: MLBAM IDs- a question that's been asked, I know..
tom@...
 
The definitive will be in this thread:

www.insidethebook.com/ee/index.php/site/comments/the_universal_player_id_and_biographical_data_project/

I'm still working on it.

Tom

> Sorry, is there a definitive place to go (or an almost definitive place to
> go) for a csv/mysql export of MLBAM IDs mapped to retro/bbref IDs?
>
> I'll make a bookmark this time, I swear. Thanks.
>
> --
> Wells Oliver
> wells@...
>


---------------------------------------------
The Book--Playing The Percentages In Baseball
http://www.InsideTheBook.com

#3925 From: Wells Oliver <wells@...>
Date: Mon Mar 1, 2010 7:57 pm
Subject: Question about Gameday data: pitch types
xmutex
 
Hey everyone. I'm working through some little things and I'm wondering if I am just missing something obvious: what I'd like to do is use the Gameday data to determine a given lineup's skill against pitch types. So I'm looking at the probable starter for a given game, and getting his pitch types like so:

SELECT
pitcher AS player_id,
count(*) AS c,
pitch_type
FROM gameday.pitch
WHERE pitcher IN (%s)
AND pitch_type IS NOT NULL
-- group by pitcher too, so counts stay per pitcher when several IDs are passed
GROUP BY pitcher, pitch_type
ORDER BY c DESC

This gives you a breakdown of # of pitch type thrown (as per Gameday determinations). However, it seems like Gameday is lacking the info to really figure out how well a batter does against a certain kind of pitch. You can count from the same pitch table where the batter is a certain ID and group by the 'des' column to see on what pitch types the hitter made a hit, but it seems... incomplete.
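
The batter-side count described above might be sketched like this. The `batter` column name and the sample `des` values are assumptions for illustration, and an in-memory SQLite table stands in for gameday.pitch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pitch (batter INTEGER, pitcher INTEGER, pitch_type TEXT, des TEXT)"
)
conn.executemany(
    "INSERT INTO pitch VALUES (?, ?, ?, ?)",
    [
        (1, 9, "FF", "In play, no out"),
        (1, 9, "FF", "Swinging Strike"),
        (1, 9, "SL", "Ball"),
        (1, 9, "SL", "Swinging Strike"),
        (1, 9, None, "Ball"),  # unclassified pitch, excluded below
    ],
)

# Outcomes per pitch type for one batter.
rows = conn.execute(
    """
    SELECT pitch_type, des, COUNT(*) AS c
    FROM pitch
    WHERE batter = ? AND pitch_type IS NOT NULL
    GROUP BY pitch_type, des
    ORDER BY pitch_type, des
    """,
    (1,),
).fetchall()
print(rows)
```

This only counts pitch-level outcomes; it says nothing about the quality of contact, which is part of why the result feels incomplete.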

Any tips here?

--
Wells Oliver
wells@...

#3926 From: Wells Oliver <wells@...>
Date: Thu Mar 4, 2010 8:14 pm
Subject: Re: Gameday data gone?
xmutex
 
ARGH, nevermind- was using 'content' and not 'component'. Apologies to one and all.

On Thu, Mar 4, 2010 at 1:15 PM, Wells Oliver <wells@...> wrote:
Anyone know if MLB.com is still going to provide the standard XML data for Gameday this year? I know it's just spring training, and so no pitchfx data, but even the standard XML files containing play by play, etc, don't appear there:

http://gd2.mlb.com/content/game/mlb/year_2010/month_03/

--
Wells Oliver
wells@...



--
Wells Oliver
wells@...

#3927 From: Sean Forman <sean-forman@...>
Date: Thu Mar 4, 2010 8:28 pm
Subject: Re: Re: Gameday data gone?
sforman71
 


sean
---
Sean Forman
Sports Reference LLC, President
http://www.sports-reference.com/




#3928 From: Wells Oliver <wells@...>
Date: Thu Mar 4, 2010 7:15 pm
Subject: Gameday data gone?
xmutex
 
Anyone know if MLB.com is still going to provide the standard XML data for Gameday this year? I know it's just spring training, and so no pitchfx data, but even the standard XML files containing play by play, etc, don't appear there:

http://gd2.mlb.com/content/game/mlb/year_2010/month_03/

--
Wells Oliver
wells@...

#3929 From: Paul Golba <pgolba2@...>
Date: Thu Mar 11, 2010 3:20 am
Subject: Errata in the 1998 NL
pgolba2
 
First email as part of the group.  Apologies if I am violating any protocols.

I've been trying to use the Baseball Databank (BDB) database to compute Win
Shares.  I've been following the book's step-by-step example with the 1998 St.
Louis Cardinals and I keep running into small discrepancies between the book and
the BDB.  Most of the time the book matches Baseball-Reference (B-R) and/or
Retrosheet.

Issues that I have encountered so far (all involved the 1998 NL):
1. Roberto Petagine (CIN) has 2 GIDP in BDB.  B-R has 1.  Retrosheet has him
hitting into two double plays, but one was a line drive type.
2. There are three pitcher HBP in the NL missing.  According to B-R John Thomson
(COL) has 2 and Gabe Gonzalez (FLO) has 1.  BDB has 0 for both.
3. The league fielding putouts (69710) do not match the total of the league
IPOuts (69719).
4. The league IPOuts (69719) do not match B-R (69720).  I haven't tracked down
where that stray out is yet.
5. For the entire NL, BDB has 837 WP, B-R has 835.  Again, I didn't track down
particular players yet.
6. The pitching runs allowed (not earned runs, just plain old runs) is off
significantly.  BDB has 11918, B-R 11943.  I did track down a couple of players
with Florida that have discrepancies.  Antonio Alfonseca has 32 in BDB
(matching his ER) but 36 in B-R.  Felix Heredia has 25 in BDB (also matching his
ER) but 30 in B-R.  There are more discrepancies than those two though.
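
A consistency check like the one in item 3 can be sketched as a pair of league totals, since every out recorded by a league's pitchers should also be a putout for a fielder. The table and column names below (Pitching.IPouts, Fielding.PO, yearID, lgID) are assumed to follow the BDB layout, the rows are made-up small numbers, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pitching (yearID INTEGER, lgID TEXT, IPouts INTEGER);
CREATE TABLE Fielding (yearID INTEGER, lgID TEXT, PO INTEGER);
INSERT INTO Pitching VALUES (1998, 'NL', 600), (1998, 'NL', 300);
INSERT INTO Fielding VALUES (1998, 'NL', 500), (1998, 'NL', 399);
""")

# League totals side by side; a nonzero difference flags a data problem.
ipouts, po = conn.execute("""
SELECT (SELECT SUM(IPouts) FROM Pitching WHERE yearID = 1998 AND lgID = 'NL'),
       (SELECT SUM(PO)     FROM Fielding WHERE yearID = 1998 AND lgID = 'NL')
""").fetchone()
diff = ipouts - po
print(ipouts, po, diff)
```

Running the same comparison per team instead of per league would narrow down where the stray outs are.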

There are also issues with fielding assists, but that does not concern me as
much as the other stuff.

Is this the right place to report these issues?  I am willing to help track down
discrepancies, upon request.

Paul Golba

#3930 From: "Tangotiger" <tom@...>
Date: Thu Mar 11, 2010 2:18 pm
Subject: Re: Errata in the 1998 NL
tom@...
 
> Is this the right place to report these issues?  I am willing to help
> track down discrepancies, upon request.
>
> Paul Golba
>

This is absolutely the right place.  And while you didn't break any
protocols (if we even have any), from where I sit you can break whatever
protocols you need to, to report data issues like this.

Truth > whatever

Tom

#3931 From: Wells Oliver <wells@...>
Date: Thu Mar 18, 2010 4:05 pm
Subject: Useful list of players?
xmutex
 
So a quick question here- I'm working on my little projection system and I'm doing a lot of hacky crap to put together the "list of MLB players" to project. Right now I'm using the entire 40 man rosters of every team as provided by mlb.com. I have to use Tango's list of IDs - which is awesome, but still under development - to match each MLB player up with their BDB ID, and excluding any guy w/o a BDB ID brings the rosters down by 10ish.

I need the BDB ID as I'm using the BDB's stats for projections.

Is there some cleaner way of doing this? Some better "list of players" from which to work?
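
One way to at least make the gaps visible is to split the roster into matched and unmatched players instead of silently dropping the unmatched ones. A minimal sketch: the roster rows and field names are made up for illustration, and the two mapped IDs are taken from earlier messages in this listing.

```python
# Roster entries keyed by MLBAM ID; "Unmapped Player" is a hypothetical
# player who has no BDB ID yet.
roster = [
    {"mlbam_id": 434672, "name": "Chris Smith"},
    {"mlbam_id": 150218, "name": "Jacque Jones"},
    {"mlbam_id": 111111, "name": "Unmapped Player"},
]

# MLBAM ID -> BDB ID, as it might be loaded from an ID mapping file.
id_map = {434672: "smithch07", 150218: "jonesja04"}

matched = [(p["name"], id_map[p["mlbam_id"]]) for p in roster if p["mlbam_id"] in id_map]
missing = [p["name"] for p in roster if p["mlbam_id"] not in id_map]

print(matched)
print(missing)
```

The `missing` list is then a worklist of IDs to resolve, or of players with no stats in the BDB to project from.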

Any tips appreciated. Thanks!

--
Wells Oliver
wells@...
