baseball-databank · Baseball Databank

Messages 2647 - 2676 of 4385
#2647 From: Paul Wendt <pgw02472@...>
Date: Mon May 2, 2005 1:51 am
Subject: nulls - what has been checked?
 
Dave Kent, thanks for answering my question about the "null to zero"
version of addition.

Re the particular example, plate appearances, it occurred to me that we
may have all the data necessary to calculate rather than estimate PA,
except for interference and obstruction.  That is, the null values for PA
components AB (no nulls), BB (no nulls), HBP, SH, SF may represent not
missing data but league-seasons in which the scoring category did not exist.
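
A minimal sketch of the calculation described above, assuming batting rows
are available as dictionaries keyed by the column names AB, BB, HBP, SH and
SF, with None standing in for a database null. The function name and the
sample values are hypothetical, and interference and obstruction are left
out, as noted.

```python
# Plate appearances from their components, treating null as zero.
# Assumption: a null (None) means the scoring category did not exist
# in that league-season, not that the data is missing.

PA_COMPONENTS = ("AB", "BB", "HBP", "SH", "SF")


def plate_appearances(row):
    """Sum the PA components of one batting row, counting None as 0."""
    return sum(row.get(col) or 0 for col in PA_COMPONENTS)


# Hypothetical batting line from a season with no HBP, SH or SF categories.
example = {"AB": 400, "BB": 30, "HBP": None, "SH": None, "SF": None}
print(plate_appearances(example))  # 430
```

If a null really does mean missing data, this sum is only a lower bound,
which is why the incidence of nulls matters.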

In MS Access lahman52, I looked at the records where HBP is null.  I found
the subset for seasons before the hit batsman rule, ten 2004 records, and
one 1916 record --the first of three 1916 stints for Earl Hamilton,
hamilea01
(That stint is not listed on the baseball-reference webpage.)
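
The same split can be sketched outside Access. This is only an
illustration: the row layout is assumed, the sample rows are made up
(only the ID hamilea01 comes from this message), and first_hbp_year is a
placeholder the caller must supply for the league in question.

```python
from collections import Counter


def null_hbp_by_year(rows):
    """Count null-HBP rows per season, to show where the nulls cluster."""
    return Counter(r["yearID"] for r in rows if r["HBP"] is None)


def stray_null_hbp(rows, first_hbp_year):
    """Return rows with a null HBP in a season at or after first_hbp_year.

    first_hbp_year is a caller-supplied assumption: the first season in
    which the hit-batsman category existed for the league being checked.
    """
    return [r for r in rows
            if r["HBP"] is None and r["yearID"] >= first_hbp_year]


# Hypothetical sample rows; 1900 below is a placeholder cutoff, not a claim.
rows = [
    {"playerID": "early01", "yearID": 1880, "stint": 1, "HBP": None},
    {"playerID": "hamilea01", "yearID": 1916, "stint": 1, "HBP": None},
    {"playerID": "hamilea01", "yearID": 1916, "stint": 2, "HBP": 0},
]
for r in stray_null_hbp(rows, first_hbp_year=1900):
    print(r["playerID"], r["yearID"], r["stint"])
```

Anything the second function returns is a candidate stray null rather than
a season in which the category did not exist.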

I don't know whether these stray nulls have been corrected in the databank.

The general question is, have we thoroughly checked the incidence of nulls
in the databank?

Paul Wendt

#2648 From: Paul Wendt <pgw02472@...>
Date: Mon May 2, 2005 2:58 am
Subject: returns to team - Pete Palmer's list
 
On Sun, 1 May 2005, Paul Wendt wrote:

> In MS Access lahman52, I looked at the records where HBP is null.  I found
> the subset for seasons before the hit batsman rule, ten 2004 records, and one
> 1916 record --the first of three 1916 stints for Earl Hamilton,
> hamilea01
> (That stint is not listed on the baseball-reference webpage.)

Pete Palmer's list of 47 returns to team is available on the web.
      http://world.std.com/~pgw/19c/return.to.team.html

I covered it in a 19c Cmte newsletter, maybe didn't mention it here.
It provides another point of reference for checking the batting, pitching
and fielding tables (although some stints may correctly be missing from
some tables).

Within MS Access lahman52, I checked returns to team in the Batting and
Fielding tables.  The "returns" by Darren Holmes 2000 and Ed Sprague 2000
are missing from that MS Access ed. and from baseball-reference.

Those happen to be the last two returns on the Palmer list linked above
--although that will change when I add two more recent instances.
playerID   yearID  teamID  NPOS
clarkje02  2003    TEX     2
huckake01  2004    TEX     1
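
For anyone who wants to repeat the check against the stint data, here is a
rough sketch. It assumes rows of (playerID, yearID, stint, teamID) and that
stint numbers are in chronological order within a season. The player IDs
and team codes in the sample are invented. It can only find returns that
are recorded as separate stints, so a return that is missing from the table
will not show up.

```python
from collections import defaultdict


def returns_to_team(stints):
    """Find (playerID, yearID, teamID) where a player came back to a team.

    stints: iterable of (playerID, yearID, stint, teamID) tuples.
    A return is a team that reappears after a stint with a different
    team in the same season.
    """
    by_season = defaultdict(list)
    for player, year, stint, team in stints:
        by_season[(player, year)].append((stint, team))

    found = []
    for (player, year), rows in sorted(by_season.items()):
        seen = set()
        previous = None
        for _, team in sorted(rows):  # sort by stint number
            if team != previous and team in seen:
                found.append((player, year, team))
            seen.add(team)
            previous = team
    return found


# Hypothetical stints: playera01 leaves AAA for BBB and then comes back.
sample = [
    ("playera01", 2000, 1, "AAA"),
    ("playera01", 2000, 2, "BBB"),
    ("playera01", 2000, 3, "AAA"),
    ("playerb01", 2000, 1, "AAA"),
    ("playerb01", 2000, 2, "BBB"),
]
print(returns_to_team(sample))
```

Comparing the output with Pete Palmer's list would show which returns the
table lacks.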

Paul Wendt

#2649 From: "Paul Wendt" <pgw02472@...>
Date: Mon May 2, 2005 6:15 pm
Subject: Re: returns to team - Pete Palmer's list
 
> Pete Palmer's list of 47 returns to team is available on the web.
>      http://world.std.com/~pgw/19c/return.to.team.html
>
> I covered it in a 19c Cmte newsletter, maybe didn't mention it here.
> It provides another point of reference for checking the batting,
> pitching and fielding tables (although some stints may correctly be
> missing from some tables).

I have added Jermaine Clark 2003 and Ken Huckaby 2004, so that is now
a list of 49 within-season returns to team.

The "returns" by Darren Holmes 2000 and Ed Sprague 2000 are missing
from MS Access lahman52 and from baseball-reference.  I haven't
checked the current databank.

Earl Hamilton 1916 is a puzzle to me.
4 ip in one game pitched.
No evident plate appearance.
That implies batting out of order, I think.

Paul Wendt

#2650 From: "KJOK" <kjokbaseball@...>
Date: Thu May 5, 2005 12:20 am
Subject: Projects in work (was Re: Introduction)
 
Since the list I posted below in response to Gary's original question
is somewhat out of date, maybe a better way to answer the question is
to list projects that each one of us IS working on?

Here's my quick list:

1. Negro Leagues - Will VERY shortly have the LEAGUE and TEAM Table
completed, plus the PARKS related tables.  After that, will have a
beginning MASTER file plus possibly some 1928 actual batting,
pitching and fielding data...

2.  LEAGUES super/master table - This turned out to be a huge
project.  Am about 50% completed, but have been waiting for the new
Minor Leagues Encyclopedia to come out before continuing (anyone have
an update on when it will be out?)

3.  TEAMS super/master table - Will follow after the LEAGUES table is
completed.

4. SALARIES Table for pre-1985 - Have made good progress on 19th
century salaries, but still huge gaps from deadball era to 1984.  Am
hopeful that SABR may be releasing some additional data soon...

5. INJURIES DATABASE - Focusing on years before the disabled list
came to be routinely used (pre-1960).  Project just getting off the
ground.

6. JAPANESE DATA - Deferring to Michael Westbay.  I do have some
historical data, primarily on players who played in both MLB and NPB,
that I need to finish putting in table form.

THANKS,
Kevin


--- In baseball-databank@yahoogroups.com, "KJOK" <kjokbaseball@y...>
wrote:
> Welcome Gary:
>
> This list needs updating, but I believe it's the last one we have
> regarding "what needs to be done" (numbers are previous message
> numbers in this group):
>
> ==========================
> > PRIORITY 1 - Handle ASAP
> > Type:
> > (1) any item that contains errors in data
> > (2) data that existed, and is now missing
> > ==========================
> > HOF ERRORS
> > 1235
> > 1237
> > 1238
> > (Sean Forman handling.)
> >
> > BATTING ERRORS
> > 1233
> > (KJOK posted corrections.)
> >
> > PITCHING ERRORS
> > 1216-1220
> > 1232
> > (KJOK posted corrections.)
> >
> > STINT ID
> > 1215
> > 1227-1228
> > 1230-1231
> >
> > PLAYER/MANAGER ERRORS
> > 1212
> >
> > DATA ERRORS
> > 1129
> > 1144
> > 1186
> > 1191-1194
> >
> > NEW FILES - errors
> > 1091
> >
> > Positions - Existence/Loss
> > 988 - 990
> > (We used to have some PR/PH data. We don't anymore.)
> >
> > ==========================
> > PRIORITY 2 - Handle Very Very Soon
> > Type:
> > (1) Organizational, procedural items for handling BDB
> > ==========================
> > DISTINGUISH BETWEEN OFFICIAL/UNOFFICIAL ERRORS
> > 1241
> > (Some are known accepted errors.)
> >
> > POW-WOW / organizational direction
> > 1112-1113
> > 1170
> > 1202
> > (CVS, roles/responsibilities. No movement.)
> >
> > SCRIPTS / FILE STRUCTURE
> > 1092-1093
> > 1095-1097
> > 1101
> > 1109
> > 1111 (proposal/accepted)
> > 1173 (proposal)
> > (Derek to write some of the scripts.)
> >
> > FAQ - SYNTAX RULES - field delimiters
> > 987
> > 1032-1037
> > (To be used for above.)
> >
> > Access - 97/2000/XP
> > many (macro only confirmed to work with Access 2000)
> > 1079
> > 1084-1085
> > (Tom needs to create the latest schema.ini file, and give direction
> > as to producing it programmatically. Scripts may need to be written.)
> >
> > ==========================
> > PRIORITY 3 - Handle Very Soon
> > Type:
> > (1) Primary Tables
> > ==========================
> > EXTRA DATA
> > 1110
> > 1114
> > 1142-1143
> > 1208
> > (Should be incorporated. Status unknown. Is Derek's XREF for
> > Retro/Player ID updated?)
> >
> >
> > ==========================
> > PRIORITY 4 - Handle Soon
> > Type:
> > (1) Design Issues, normalization, keys
> > (2) XREF to other databases
> > (3) Standards
> > ==========================
> > SIZE OF FIELDS
> > 1213
> > (Try to improve sizing of DB.)
> >
> > STINT ID
> > 1196
> > (Use is inconsistent or wrong.)
> >
> > PARKS
> > 1117-1120
> > 1161
> > 1181-1182
> > 1223
> > (Tom/KJOK updated and uploaded. The BDB/TEAMS table should be updated
> > so that it references the new PARK ID, and not the Park NAME.)
> >
> > TEAMS / FRANCHISES - XREF to retrosheet
> > 1145-1147
> > 1174-1175
> > 1183-1184
> >
> > http://groups.yahoo.com/group/RetroList/message/1904
> > 1907-1911
> > (Paul, KJOK reviewing. Discussions to follow. Tom to upload
> > mid-Jan.)
> >
> >
> > DESIGN - leagues - create table
> > 1055-1062
> > 1068-1075
> > 1077-1081
> > (Paul, KJOK, Tom should propose something.)
> >
> > DESIGN - normalization - teams/franchises
> > 1042
> > 1045
> > 1049
> > 1051-1053
> > (Should be implemented, unless more discussions required.)
> >
> > DESIGN - keys - manager table
> > 1043
> > (Should be implemented. This is in ERROR.)
> >
> > DESIGN - keys - lg ID
> > 1044
> > 1048
> > (Should be implemented. This is in ERROR.)
> >
> > Place Names - ISO standards
> > 940
> > 950
> > 952
> > 956
> > 960
> > 967-969
> > (Proposal/resolution required.)
> >
> >
> > ==========================
> > PRIORITY 5 - Handle At Some point
> > Type:
> > (1) Cool add-ons
> > (2) Secondary (tertiary?) data/tables
> > ==========================
> > POSITIONS
> > 1178
> > 1239-1240
> > (Need someone to program the determination of primary positions.
> > Tom has uploaded complex query, but it could be improved.)
> >
> > Umpires/HOF/Coaches/Executives/no-hitters
> > 876
> > 879
> > 881
> > 883-886
> > 888-889
> > 891-898
> > 1133
> > 1136-1139
> > 1185
> > 1199-1201
> > (Mike Crain: Umps working on. HOF completed and integrated.
> > Coaches near completion. Executives ongoing.)
> > (Derek: working on no-hitters from Mike.)
> >
> > END-USER INTERFACES
> > 1203-1206
> > 1209-1210
> > 1214
> > (Ongoing work by a few individuals.)
> >
> >
> > 19th-century transactions
> > 1224
> > (Mike uploaded.)
> >
> > ==========================
> > PRIORITY 6 - Unknown
> > ==========================
> > INSIDE THE PARK HR
> > 1163
> > (On hold.)
> >
> > MASTER - DOB
> > 1054
> > (No one has stepped forward to handle. Not critical.)
> >
> > Negro League data, Japanese League data, Minor League data,
> > Executives
> > (No note. No one has stepped forward to handle. Not critical.)
>
>
> --- In baseball-databank@yahoogroups.com, "Gary" <gcohen33@y...> wrote:
> >
> > Hello,
> >
> > I'd just like to introduce myself to this group and offer my time.  I
> > am a lifelong baseball fan, SQL developer, and stat geek.  I've been
> > known to spend hours on baseballreference.com just for the fun of it.
> > I have just recently learned of the existence of this group and would
> > love to contribute in any way, whether it be loading data,
> > programming, quality control or otherwise.
> >
> > If anyone could give me some direction re: what is needed and what
> > would be helpful, I would love to hear it.
> >
> > Thanks,
> >
> > Gary Cohen

#2651 From: "Mark" <mramsey81883@...>
Date: Sun May 8, 2005 2:49 pm
Subject: New Member Introduction
 
Hello,

First of all let me say I am actually so surprised to actually see
what appears to be dozens by first glance (maybe even hundreds) of
true baseball fans and followers not to mention historians of the
game in this dedicated group that I could not resist the honor of
introducing myself, Mark Ramsey, 21, Newport, Tennessee.

I stumbled across this group while searching for an easy accurate way
to update my rosters for MVP Baseball 2005 on a daily basis......and
so far I would say that this group is more than I ever imagined
finding.....I am looking forward to checking things out and will be
more than glad to help out however I can when possible.


Thanks for allowing me into the group and its nice to be among people
who share the love of baseball like I do

Regards,

Mark Ramsey
mramsey81883@... also on messenger quite often.

#2652 From: "timmermant" <timmermant@...>
Date: Mon May 9, 2005 9:06 pm
Subject: Error in Manager table
 
I'm not sure if this has been mentioned before, but I believe I found
an error in the Manager table.

I believe
collite99m  1999 ANA     AL
should be 133 games 51 wins 82 losses

Currently he's listed as 1 (of 2) inseason with 162 games (70/92). I
believe his replacement (maddon) is correct.
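
An error like this one can be caught mechanically, because the games
credited to a team's managers in a season should add up to the team's games.
A hedged sketch, assuming manager rows of (managerID, yearID, teamID, G) and
a lookup of team games. The "replacement" ID is a placeholder, and its 29
games is inferred here as 162 - 133, not taken from the table.

```python
from collections import defaultdict


def manager_game_mismatches(manager_rows, team_games):
    """Compare summed manager games with team games per team-season.

    manager_rows: iterable of (managerID, yearID, teamID, G) tuples.
    team_games: dict mapping (yearID, teamID) to games played by the team.
    Returns {(yearID, teamID): (sum_of_manager_G, team_G)} for mismatches.
    """
    totals = defaultdict(int)
    for _, year, team, games in manager_rows:
        totals[(year, team)] += games
    return {key: (total, team_games[key])
            for key, total in totals.items()
            if key in team_games and total != team_games[key]}


team_games = {(1999, "ANA"): 162}

# As currently listed: the first manager carries the full-season total.
listed = [("collite99m", 1999, "ANA", 162), ("replacement", 1999, "ANA", 29)]
# With the proposed correction of 133 games.
corrected = [("collite99m", 1999, "ANA", 133), ("replacement", 1999, "ANA", 29)]

print(manager_game_mismatches(listed, team_games))     # {(1999, 'ANA'): (191, 162)}
print(manager_game_mismatches(corrected, team_games))  # {}
```

The same comparison could be run on wins and losses.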

Tom

#2653 From: Maury Brown <maurybaseballcrazy@...>
Date: Tue May 10, 2005 7:59 pm
Subject: Data on Rainouts?
 
Anyone have a DB of rainouts? I'm helping someone in the media on an article and am in need of relevant data.
 
Thanks,
Maury Brown

#2654 From: Maury Brown <maurybaseballcrazy@...>
Date: Wed May 11, 2005 2:59 pm
Subject: Introduction - Maury Brown
 
Hi all,
Wanted to take a second and introduce myself to the listserve.
 
My name is Maury Brown and I serve as the co-chair for SABR's Business of Baseball committee, and write columns on the business of baseball for Baseball Think Factory.
 
Myself and Gary Gillette are also the stewards for the late, great Doug Pappas' personal research material.
 
I also designed and maintain Business of Baseball.com, the SABR BoB committee website that gets a bit of use outside of just SABR.
 
I know that there are some great people on this list (and Sean, thanks for approving me to the list).
 
I know that there has been considerable interest in a number of Pappas' research projects, so if there's questions... fire away.
 
By the way... If there are questions on his ejection data, I'm the wrong man to ask. David Vincent is the steward of that material.
 
Thanks,
Maury Brown
Co-Chair - SABR Business of Baseball committee
Columnist - Baseball Think Factory

#2656 From: "KJOK" <kjokbaseball@...>
Date: Thu May 19, 2005 8:46 pm
Subject: Re: Introduction - Maury Brown
 
Maury:

For data that your committee is collecting, such as General Managers
by team/year, how will that data be publicly distributed?   Will it
be available in an electronic file format so that it can be
incorporated into other databases, such as the baseball databank?

THANKS,
Kevin Johnson

--- In baseball-databank@yahoogroups.com, Maury Brown
<maurybaseballcrazy@y...> wrote:
> Hi all,
> Wanted to take a second and introduce myself to the listserve.
>
> My name is Maury Brown and I serve as the co-chair for SABR's
> Business of Baseball committee, and write columns on the business of
> baseball for Baseball Think Factory.
>
> Myself and Gary Gillette are also the stewards for the late, great
> Doug Pappas' personal research material.
>
> I also designed and maintain Business of Baseball.com, the SABR BoB
> committee website that gets a bit of use outside of just SABR.
>
> I know that there are some great people on this list (and Sean,
> thanks for approving me to the list).
>
> I know that there has been considerable interest in a number of
> Pappas' research projects, so if there's questions... fire away.
>
> By the way... If there are questions on his ejection data, I'm the
> wrong man to ask. David Vincent is the steward of that material.
>
> Thanks,
> Maury Brown
> Co-Chair - SABR Business of Baseball committee
> http://www.businessofbaseball.com
> Columnist - Baseball Think Factory

#2657 From: Maury Brown <maurybaseballcrazy@...>
Date: Thu May 19, 2005 11:39 pm
Subject: Re: Re: Introduction - Maury Brown
 
We're currently working on how to distribute the data. Since this is a SABR project, there has been discussion of somehow incorporating it into the SABR Encyclopedia portion of SABR.org, but this is still up in the air.
 
Thanks,
Maury

KJOK <kjokbaseball@...> wrote:
Maury:

For data that your committee is collecting, such as General Managers
by team/year, how will that data be publicly distributed?   Will it
be available in an electronic file format so that it can be
incorporated into other databases, such as the baseball databank?

THANKS,
Kevin Johnson

--- In baseball-databank@yahoogroups.com, Maury Brown
<maurybaseballcrazy@y...> wrote:
> Hi all,
> Wanted to take a second and introduce myself to the listserve.

> My name is Maury Brown and I serve as the co-chair for SABR's
> Business of Baseball committee, and write columns on the business of
> baseball for Baseball Think Factory.
>
> Myself and Gary Gillette are also the stewards for the late, great
> Doug Pappas' personal research material.
>
> I also designed and maintain Business of Baseball.com, the SABR BoB
> committee website that gets a bit of use outside of just SABR.
>
> I know that there are some great people on this list (and Sean,
> thanks for approving me to the list).
>
> I know that there has been considerable interest in a number of
> Pappas' research projects, so if there's questions... fire away.
>
> By the way... If there are questions on his ejection data, I'm the
> wrong man to ask. David Vincent is the steward of that material.

> Thanks,
> Maury Brown
> Co-Chair - SABR Business of Baseball committee
> http://www.businessofbaseball.com
> Columnist - Baseball Think Factory






#2658 From: baseball-databank@yahoogroups.com
Date: Sat May 21, 2005 6:34 pm
Subject: New file uploaded to baseball-databank
 
Hello,

This email message is a notification to let you know that
a file has been uploaded to the Files area of the baseball-databank
group.

   File        : /Negro Leagues Data/Leagues Table Summary_Negro Leagues.xls
   Uploaded by : kjokbaseball <kjokbaseball@...>
   Description : Negro Leagues League Table

You can access this file at the URL:
http://groups.yahoo.com/group/baseball-databank/files/Negro%20Leagues%20Data/Leagues%20Table%20Summary_Negro%20Leagues.xls

To learn more about file sharing for your group, please visit:
http://help.yahoo.com/help/us/groups/files

Regards,

kjokbaseball <kjokbaseball@...>

#2659 From: baseball-databank@yahoogroups.com
Date: Sat May 21, 2005 6:35 pm
Subject: New file uploaded to baseball-databank
 
Hello,

This email message is a notification to let you know that
a file has been uploaded to the Files area of the baseball-databank
group.

   File        : /Negro Leagues Data/Teams_Negro Leagues.xls
   Uploaded by : kjokbaseball <kjokbaseball@...>
   Description : Negro Leagues Teams Table 1920-1950

You can access this file at the URL:
http://groups.yahoo.com/group/baseball-databank/files/Negro%20Leagues%20Data/Teams_Negro%20Leagues.xls

To learn more about file sharing for your group, please visit:
http://help.yahoo.com/help/us/groups/files

Regards,

kjokbaseball <kjokbaseball@...>

#2660 From: baseball-databank@yahoogroups.com
Date: Sat May 21, 2005 6:35 pm
Subject: New file uploaded to baseball-databank
 
Hello,

This email message is a notification to let you know that
a file has been uploaded to the Files area of the baseball-databank
group.

   File        : /Negro Leagues Data/NegLg_Parks.xls
   Uploaded by : kjokbaseball <kjokbaseball@...>
   Description : Negro Leagues Park Table

You can access this file at the URL:
http://groups.yahoo.com/group/baseball-databank/files/Negro%20Leagues%20Data/NegLg_Parks.xls

To learn more about file sharing for your group, please visit:
http://help.yahoo.com/help/us/groups/files

Regards,

kjokbaseball <kjokbaseball@...>

#2661 From: baseball-databank@yahoogroups.com
Date: Sat May 21, 2005 6:36 pm
Subject: New file uploaded to baseball-databank
 
Hello,

This email message is a notification to let you know that
a file has been uploaded to the Files area of the baseball-databank
group.

   File        : /Negro Leagues Data/NegLg_ParkConfig.xls
   Uploaded by : kjokbaseball <kjokbaseball@...>
   Description : Negro Leagues Parks Configuration Table

You can access this file at the URL:
http://groups.yahoo.com/group/baseball-databank/files/Negro%20Leagues%20Data/NegLg_ParkConfig.xls

To learn more about file sharing for your group, please visit:
http://help.yahoo.com/help/us/groups/files

Regards,

kjokbaseball <kjokbaseball@...>

#2662 From: "dsreyn" <dreynolds@...>
Date: Fri May 27, 2005 3:18 am
Subject: Batting and pitching consistency checks
 
I have run some consistency checks to compare the sum of individual
player totals in the batting tables with those in the pitching tables
(for example, comparing batters' hits with hits allowed), and
comparing the individual batting and pitching data with team totals.
This has turned up quite a few inconsistencies.
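
As an illustration of the batting-versus-pitching comparison, here is a
small sketch. It is not the code behind the posted results. It assumes rows
as dictionaries with yearID, lgID and a stat column such as H, and the
sample numbers are invented. The league-level identity only holds for
seasons without interleague play.

```python
from collections import defaultdict


def league_totals(rows, stat):
    """Sum one stat by (yearID, lgID), counting nulls (None) as zero."""
    totals = defaultdict(int)
    for r in rows:
        totals[(r["yearID"], r["lgID"])] += r[stat] or 0
    return totals


def batting_vs_pitching(batting, pitching, stat="H"):
    """Return {(yearID, lgID): (batting_sum, pitching_sum)} where they differ."""
    bat = league_totals(batting, stat)
    pit = league_totals(pitching, stat)
    return {key: (bat.get(key, 0), pit.get(key, 0))
            for key in sorted(set(bat) | set(pit))
            if bat.get(key, 0) != pit.get(key, 0)}


# Hypothetical rows: the AL pitchers allowed one more hit than the batters made.
batting = [
    {"yearID": 1901, "lgID": "AL", "H": 10},
    {"yearID": 1901, "lgID": "AL", "H": 5},
    {"yearID": 1901, "lgID": "NL", "H": 7},
]
pitching = [
    {"yearID": 1901, "lgID": "AL", "H": 16},
    {"yearID": 1901, "lgID": "NL", "H": 7},
]
print(batting_vs_pitching(batting, pitching))  # {(1901, 'AL'): (15, 16)}
```

The comparison against team totals works the same way, with teamID added
to the grouping key.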

There are only two cases I can think of that should not necessarily
add up - individual shutouts vs. team shutouts, and individual earned
runs vs. team earned runs.  One other exception that *appears* to
exist is the awarding of wins and losses in forfeited games, though
from checking the list of forfeits at Retrosheet, it looks like
decisions are awarded in some forfeits but not others.

In any case, I have posted the results of my consistency checks at
http://alum.wpi.edu/~dreynolds/baseball/baseball.html.  It looks like
many of these, particularly the few recent ones, are simply typos.
Tracking down the source of some of the older ones may be more difficult.

Doug

#2663 From: "KJOK" <kjokbaseball@...>
Date: Sat May 28, 2005 7:51 am
Subject: Re: Batting and pitching consistency checks
 
Doug:

Great - we probably need to do this more often.  However, your link
below doesn't work due to the "~" that was inserted by Yahoo, so
could you post the complete link?

Also, we probably need to do a check vs. Retrosheet now that
Retrosheet goes back to 1960.  IIRC, BBDB and Retrosheet matched
pretty closely until you got back to 1943, but it would still be good
to check as some of the Retrosheet totals may have changed.

THANKS,
Kevin



--- In baseball-databank@yahoogroups.com, "dsreyn" <dreynolds@a...>
wrote:
> I have run some consistency checks to compare the sum of individual
> player totals in the batting tables with those in the pitching tables
> (for example, comparing batters' hits with hits allowed), and
> comparing the individual batting and pitching data with team totals.
> This has turned up quite a few inconsistencies.
>
> There are only two cases I can think of that should not necessarily
> add up - individual shutouts vs. team shutouts, and individual earned
> runs vs. team earned runs.  One other exception that *appears* to
> exist is the awarding of wins and losses in forfeited games, though
> from checking the list of forfeits at Retrosheet, it looks like
> decisions are awarded in some forfeits but not others.
>
> In any case, I have posted the results of my consistency checks at
> http://alum.wpi.edu/~dreynolds/baseball/baseball.html.  It looks like
> many of these, particularly the few recent ones are simply typos.
> Tracking down the source of some of the older ones may be more difficult.
>
> Doug

#2664 From: "David Kent" <wrgptfan@...>
Date: Sat May 28, 2005 2:36 pm
Subject: Re: Re: Batting and pitching consistency checks
 
It is the period at the end that caused the problem.  Try this instead.

http://alum.wpi.edu/~dreynolds/baseball/baseball.html

...Dave Kent

On Sat, 28 May 2005 07:51:32 -0000, "KJOK" <kjokbaseball@...>
said:
>
> Doug:
>
> Great - we probably need to do this more often.  However, your link
> below doesn't work due to the "~" that was inserted by Yahoo, so
> could you post the complete link?
>
> Also, we probably need to do a check vs. Retrosheet now that
> Retrosheet goes back to 1960.  IIRC, BBDB and Retrosheet matched
> pretty closely until you got back to 1943, but it would still be good
> to check as some of the Retrosheet totals may have changed.
>
> THANKS,
> Kevin
>
> --- In baseball-databank@yahoogroups.com, "dsreyn" <dreynolds@a...>
> wrote:
> > I have run some consistency checks to compare the sum of individual
> > player totals in the batting tables with those in the pitching tables
> > (for example, comparing batters' hits with hits allowed), and
> > comparing the individual batting and pitching data with team totals.
> > This has turned up quite a few inconsistencies.
> >
> > There are only two cases I can think of that should not necessarily
> > add up - individual shutouts vs. team shutouts, and individual earned
> > runs vs. team earned runs.  One other exception that *appears* to
> > exist is the awarding of wins and losses in forfeited games, though
> > from checking the list of forfeits at Retrosheet, it looks like
> > decisions are awarded in some forfeits but not others.
> >
> > In any case, I have posted the results of my consistency checks at
> > http://alum.wpi.edu/~dreynolds/baseball/baseball.html.  It looks like
> > many of these, particularly the few recent ones are simply typos.
> > Tracking down the source of some of the older ones may be more
> > difficult.
> >
> > Doug

#2665 From: Sean Forman <sean-forman@...>
Date: Sat May 28, 2005 5:14 pm
Subject: Re: Batting and pitching consistency checks
 
dsreyn wrote:
> I have run some consistency checks to compare the sum of individual
> player totals in the batting tables with those in the pitching tables
> (for example, comparing batters' hits with hits allowed), and
> comparing the individual batting and pitching data with team totals.
> This has turned up quite a few inconsistencies.
>
> There are only two cases I can think of that should not necessarily
> add up - individual shutouts vs. team shutouts, and individual earned
> runs vs. team earned runs.  One other exception that *appears* to
> exist is the awarding of wins and losses in forfeited games, though
> from checking the list of forfeits at Retrosheet, it looks like
> decisions are awarded in some forfeits but not others.
>
> In any case, I have posted the results of my consistency checks at
> http://alum.wpi.edu/~dreynolds/baseball/baseball.html.  It looks like
> many of these, particularly the few recent ones are simply typos.
> Tracking down the source of some of the older ones may be more difficult.
>
> Doug



Doug,

Good stuff.  I'm actually shocked that the pitching vs. batting totals
are as good as they are.  I suspect that if we had the dailies from the
Hall of Fame they wouldn't balance up either.  There are countless
errors in the record and retrosheet always finds stuff like this in
their records when they get the season files proofed.

The IP errors in the pitching table are probably a rounding error, as at
some point the db had just integer values for innings pitched (most of the
team innings miraculously have no partial innings), so that will take a
little bit of work, though we might find out that that is always the
error and can just set them equal to the sum of the pitchers on the team.
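
One way to test the rounding theory is sketched below, assuming innings are
held as outs (three per inning) for both teams and pitchers. The data
layout, function name and sample figures are all hypothetical. A mismatch
is flagged as a likely rounding case when the team total is a whole number
of innings and is within two outs of the pitchers' sum.

```python
def ip_rounding_suspects(team_outs, pitcher_outs):
    """Flag team-seasons whose innings look rounded to whole innings.

    team_outs: dict mapping (yearID, teamID) to team outs recorded.
    pitcher_outs: iterable of (yearID, teamID, outs) for each pitcher.
    Returns {(yearID, teamID): (team_total, pitcher_total)} for suspects.
    """
    summed = {}
    for year, team, outs in pitcher_outs:
        summed[(year, team)] = summed.get((year, team), 0) + outs

    suspects = {}
    for key, team_total in team_outs.items():
        pitcher_total = summed.get(key)
        if pitcher_total is None or pitcher_total == team_total:
            continue
        whole_innings = team_total % 3 == 0
        within_rounding = abs(team_total - pitcher_total) < 3
        if whole_innings and within_rounding:
            suspects[key] = (team_total, pitcher_total)
    return suspects


# Hypothetical data: AAA is off by one out, BBB is off by ten.
team_outs = {(1950, "AAA"): 4350, (1950, "BBB"): 4320}
pitcher_outs = [
    (1950, "AAA", 2000), (1950, "AAA", 2349),
    (1950, "BBB", 2000), (1950, "BBB", 2310),
]
print(ip_rounding_suspects(team_outs, pitcher_outs))  # {(1950, 'AAA'): (4350, 4349)}
```

Mismatches that are not flagged would need a closer look before being
overwritten with the pitchers' sum.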

Also, barring PBP data, I don't think we are going to be able to balance
the SO and BB data for the batting files.


--
Sincerely,
Sean Forman

Baseball Stats!   http://www.Baseball-Reference.com/

#2666 From: Paul Wendt <pgw02472@...>
Date: Sat May 28, 2005 5:19 pm
Subject: Re: Batting and pitching consistency checks
 
KJOK replied to Dave Kent:
>>
. . . we probably need to do a check vs. Retrosheet now that Retrosheet
goes back to 1960. IIRC, BBDB and Retrosheet matched pretty closely until
you got back to 1943, but it would still be good to check as some of the
Retrosheet totals may have changed.
<<

IIUC, Retrosheet season statistics are official, not derived from
game-level retrodata.  There is a perfect match for team games played,
maybe not for any other stat, even team runs scored.

For some discrepancies reported by Dave Kent, the Discrepancies database
now on a Retrosheet back burner(?) must be something to wait for.
Go to #2621 or search "discrepancies" in the archive.

Can we identify kinds of discrepancies in bbdb that are likely to be based
on bbdb compilation or transcription errors, rather than on inherited
discrepancies in the official statistics?


26 May 2005, Dave Kent to bbdb:
>>
I have run some consistency checks to compare the sum of individual player
totals in the batting tables with those in the pitching tables (for
example, comparing batters' hits with hits allowed), and comparing the
individual batting and pitching data with team totals. This has turned up
quite a few inconsistencies.
<<

Who but Pete Palmer knows the number of discrepancies in his version of
the official statistics?  Or someone at Elias regarding theirs?

>>
One other exception that *appears* to exist is the awarding of wins and
losses in forfeited games, though from checking the list of forfeits at
Retrosheet, it looks like decisions are awarded in some forfeits but not
others.
<<

Try this.  The win and loss are attributed to individual pitchers if an
official game is terminated by forfeit to the team that leads.

>>
In any case, I have posted the results of my consistency checks at
      http://alum.wpi.edu/~dreynolds/baseball/baseball.html
It looks like many of these, particularly the few recent ones are simply
typos. Tracking down the source of some of the older ones may be more
difficult.
<<

IIUC, bbdb is likely to include transcription errors only in fielding
games by outfield position (LF-CF-RF).

Batting, fielding, and pitching lines by stint, for multi-team players,
were compiled by Pete Palmer and other latterday researchers.  Maybe that
introduced discrepancies between players team totals.

For 1901, Dave Kent reports three discrepancies between the sums of
individual batting and pitching data.

>>
Individual Batting vs. Pitching - AL, 1901-1996 (Baseball Databank
version)
                 InBat InPit
1901 AL HBP:     412   447

Individual Batting vs. Pitching - NL, 1901-1996 (Baseball Databank
version)
                 InBat InPit
1901 NL Runs:   5197  5193
1901 NL HBP:     418   439
<<

For many seasons, the now-official HB data was compiled by a SABR project.
Perhaps the notable aggregate shortfall in batting HB is the number of
genuine HB that could not be attributed to a particular batter.  Many old
box scores do identify HB by pitcher but not by batter.
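
For scale, the 1901 shortfall works out as follows from the figures quoted
above. This is just the arithmetic, and the variable names are mine.

```python
# 1901 hit-by-pitch totals as quoted above: (batting sum, pitching sum).
hbp_1901 = {"AL": (412, 447), "NL": (418, 439)}

# Shortfall = HBP credited to pitchers minus HBP credited to batters.
shortfall = {lg: pit - bat for lg, (bat, pit) in hbp_1901.items()}

for lg, missing in shortfall.items():
    print(lg, missing)  # AL 35, then NL 21
```

On this reading, 35 AL and 21 NL hit batsmen are charged to a pitcher but
not credited to any batter.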

I'm sorry this degenerated into miscellaneous observations.

Paul Wendt

#2667 From: "KJOK" <kjokbaseball@...>
Date: Sat May 28, 2005 6:46 pm
Subject: Re: Batting and pitching consistency checks
 
"IIUC, Retrosheet season statistics are official, not derived from
game-level retrodata.  There is a perfect match for team games
played,
maybe not for any other stat, even team runs scored."

Oooh, I didn't remember it this way.  If that's true, then maybe
someone can calculate the Retrosheet totals from game-level retrodata?


#2668 From: "dsreyn" <dreynolds@...>
Date: Sat May 28, 2005 4:47 pm
Subject: Re: Batting and pitching consistency checks
 
I initially omitted intentional walks in the individual batting vs.
individual pitching comparisons, so I have updated the files on my
site.  And as Dave pointed out, the correct link is:

http://alum.wpi.edu/~dreynolds/baseball/baseball.html

Doug

--- In baseball-databank@yahoogroups.com, "David Kent" <wrgptfan@f...>
wrote:
> It is the period at the end that caused the problem.  Try this instead.
>
> http://alum.wpi.edu/~dreynolds/baseball/baseball.html
>
> ...Dave Kent
>
> On Sat, 28 May 2005 07:51:32 -0000, "KJOK" <kjokbaseball@y...> said:
> >
> > Doug:
> > Great - we probably need to do this more often.  However, your link
> > below doesn't work due to the "~" that was inserted by Yahoo, so
> > could you post the complete link?
> > Also, we probably need to do a check vs. Retrosheet now that
> > Retrosheet goes back to 1960.  IIRC, BBDB and Retrosheet matched
> > pretty closely until you got back to 1943, but it would still be
> > good to check as some of the Retrosheet totals may have changed.
> > THANKS,
> > Kevin
> > --- In baseball-databank@yahoogroups.com, "dsreyn" <dreynolds@a...> wrote:
> > > I have run some consistency checks to compare the sum of individual
> > > player totals in the batting tables with those in the pitching tables
> > > (for example, comparing batters' hits with hits allowed), and
> > > comparing the individual batting and pitching data with team totals.
> > > This has turned up quite a few inconsistencies.
> > >
> > > There are only two cases I can think of that should not necessarily
> > > add up - individual shutouts vs. team shutouts, and individual earned
> > > runs vs. team earned runs.  One other exception that *appears* to
> > > exist is the awarding of wins and losses in forfeited games, though
> > > from checking the list of forfeits at Retrosheet, it looks like
> > > decisions are awarded in some forfeits but not others.
> > >
> > > In any case, I have posted the results of my consistency checks at
> > > http://alum.wpi.edu/~dreynolds/baseball/baseball.html.  It looks like
> > > many of these, particularly the few recent ones are simply typos.
> > > Tracking down the source of some of the older ones may be more
> > > difficult.
> > >
> > > Doug

#2669 From: "tjruane" <truane@...>
Date: Sun May 29, 2005 1:10 pm
Subject: Re: Batting and pitching consistency checks
 
Paul Wendt wrote:

> IIUC, Retrosheet season statistics are official, not derived
> from game-level retrodata.  There is a perfect match for team
> games played, maybe not for any other stat, even team runs
> scored.

This is true.  In the new release, Retrosheet started using official
stats whenever possible.

> For some discrepancies reported by Dave Kent, the Discrepancies
> database now on a Retrosheet back burner(?) must be something
> to wait for.  Go to #2621 or search "discrepancies" in the
> archive.

This is not on the back burner, but it is a large undertaking that we
hope to unveil in the 2006 release.  For each game-level stat that we
disagree with, we will have a record in our discrepancy database.
Each record has the following comma-delimited fields:

discrepancy-id,
player-id,
year,
team,
type, - O (offense), P (pitching) or D (defense)
position, - for D category discrepancies
category, - an abbreviation of the stat we disagree about
game-id, - is the discrepancy is related to a specific game
our value,
official value,
cross-reference, - the ID of a related discrepancy
class of discrepancy, - a code identifying some common causes
comment

Two examples from 1963 NL:

1963N002,batej102,1963,HOU,O,,GDP,HOU196308240,
   0,1,1963N010,"SWAP,IMP","1-3 with 2 Ks and 1 GDP"
1963N010,wynnj101,1963,HOU,O,,GDP,HOU196308240,1,0,1963N002,SWAP,

This identifies two related offensive discrepancies in which
officially, John Bateman was credited with grounding into a DP that
should have been credited to Jimmy Wynn.  In addition, this official
error resulted in an impossible batting line for Bateman.
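
For readers who want to experiment with this layout, here is a minimal sketch of parsing such records with Python's standard csv module. The field names in FIELDS are my own labels for the thirteen fields listed above, and the assumption that a multi-valued class such as "SWAP,IMP" splits on commas is inferred from the example; neither is a published Retrosheet format. The two records are the ones quoted above, with the first rejoined onto one line.

```python
import csv
import io

# Labels for the thirteen comma-delimited fields (names are assumptions).
FIELDS = [
    "discrepancy_id", "player_id", "year", "team", "type", "position",
    "category", "game_id", "our_value", "official_value",
    "cross_reference", "discrepancy_class", "comment",
]

# The two 1963 NL records quoted above.
RAW = (
    '1963N002,batej102,1963,HOU,O,,GDP,HOU196308240,'
    '0,1,1963N010,"SWAP,IMP","1-3 with 2 Ks and 1 GDP"\n'
    '1963N010,wynnj101,1963,HOU,O,,GDP,HOU196308240,1,0,1963N002,SWAP,\n'
)


def parse_discrepancies(text):
    """Parse discrepancy records into dicts keyed by FIELDS."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        rec = dict(zip(FIELDS, row))
        # A quoted class field may hold several codes, e.g. "SWAP,IMP".
        rec["classes"] = rec["discrepancy_class"].split(",")
        records.append(rec)
    return records


records = parse_discrepancies(RAW)
by_id = {rec["discrepancy_id"]: rec for rec in records}

# Follow each cross-reference to its related discrepancy.
for rec in records:
    partner = by_id.get(rec["cross_reference"])
    if partner is not None:
        print(rec["player_id"], rec["category"],
              "ours:", rec["our_value"], "official:", rec["official_value"],
              "related to", partner["player_id"])
```

The csv reader is used rather than a plain split on commas because the class and comment fields can themselves contain commas inside quotes.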

To give people an idea of the scope of this undertaking, there are 356
records for the 1963 NL alone, and this was a good year.  The 1967 AL
has 459 records while the 1969 AL has 558 records and is not even
complete yet.

The plan is to make this database available when we have enough
leagues covered to make it interesting.  In addition, we hope to tie
this into the html files so that whenever we display a disputed
statistic we turn that stat into a link to a file explaining the
discrepancy.

Tom Ruane

#2670 From: Paul Wendt <pgw02472@...>
Date: Sun May 29, 2005 6:27 pm
Subject: Re: Batting and pitching consistency check
 
Tom Ruane replied, in part:
>
For each game-level stats that we disagree with, we will have a record in our
discrepancy database.
. . .
game-id, - is [? if?] the discrepancy is related to a specific game
<

Tom,

Will you use records with null game-id to cover some season-level aggregation
discrepancies?  Eg, the official number of starts for a pitcher or runs scored
for a team is unequal to the aggregate of game-level data.

Paul

#2671 From: "tjruane" <truane@...>
Date: Mon May 30, 2005 2:10 pm
Subject: Re: Batting and pitching consistency check
 
Paul Wendt wrote:

> Will you use records with null game-id to cover some season-level
> aggregation discrepancies?  Eg, the official number of starts for
> a pitcher or runs scored for a team is unequal to the aggregate
> of game-level data.

Yes, if the discrepancy is not related to a specific game, then the
game-id field will be empty.  Typically, the classes associated with
these kinds of discrepancies are "ADD" (if there was an apparent error
adding the daily lines) or "COPY" (if the sum lines in the official
dailies agree with us, but there was an apparent error transcribing
the total).

Tom Ruane

#2672 From: Paul Wendt <pgw02472@...>
Date: Tue May 31, 2005 8:32 pm
Subject: LF-CF-RF missing from FieldingOF
 
My mailprogram breaks long incoming lines and permits long outgoing lines.
I don't know about Yahoo Groups or your mailprogram.

Joining the Fielding and FieldingOF tables in lahman52, I find 36 stints
with OF data but no LF-CF-RF data in FieldingOF.  There may also be a
problem re Kowalik and the last class of 31 (below) at retrosheet.
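
The join described here can be sketched roughly as follows. This is a simplified illustration in plain Python, not the Access query that was actually run: the row layouts are assumptions, the first two players' OF and P games are taken from the listing later in this thread, and "examppl01" is a made-up player included only to show a stint that does have a split.

```python
# Assumed, simplified layouts (not the real lahman52 schema):
#   fielding rows: (playerID, yearID, stint, POS, G)
#   fielding_of:   (playerID, yearID, stint) -> (LF, CF, RF) games
fielding = [
    ("whitewi01", 1882, 1, "OF", 2),
    ("whitewi01", 1882, 1, "P", 54),
    ("wisebi01", 1884, 1, "OF", 43),
    ("wisebi01", 1884, 1, "P", 50),
    ("examppl01", 1900, 1, "OF", 10),   # hypothetical player
]
fielding_of = {
    ("examppl01", 1900, 1): (4, 3, 3),  # hypothetical LF-CF-RF split
}


def of_stints_without_split(fielding_rows, of_splits):
    """Return OF stints that have no matching LF-CF-RF record."""
    missing = []
    for player, year, stint, pos, games in fielding_rows:
        key = (player, year, stint)
        if pos == "OF" and games > 0 and key not in of_splits:
            missing.append((player, year, stint, games))
    return missing


missing_splits = of_stints_without_split(fielding, fielding_of)
for player, year, stint, games in missing_splits:
    print(player, year, "stint", stint, "OF games:", games)
```

In SQL terms this is a left join from Fielding (restricted to OF rows) to FieldingOF on player, year, and stint, keeping the rows where the right side is null.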

There are two since 1926, both multistint player-seasons.

playerID yearID stint teamID lgID G C 1B 2B 3B SS LF CF RF OF P PS
kowalfa01 1936 2 PHI NL 26 							 4 22 8
kowalfa01 1936 3 BSN NL 1 				 0 1 2  1 1

Maybe LF-CF-RF games for stint 2 are assigned to stint 3.

playerID yearID stint teamID lgID G C 1B 2B 3B SS LF CF RF OF P PS
morgajo01 1960 1 PHI NL 24 		 24  0 0 2
morgajo01 1960 2 CLE AL 14 		 12 			 2

Maybe LF-CF-RF games for stint 1 are assigned to stint 2.

At bb-ref (and hence in the bbdb?), it appears to me that Joe Morgan has
been corrected but Fabian Kowalik has been revised incorrectly.  Kowalik
is a pitcher without batting/fielding data in TB6 or new BBE.  Neft/Cohen
1995 shows Kowalik 1936 with 3 OF games for Philly and none for Boston,
presumably from older data.  That fits the plausibility that lahman52 makes the same
mistake as for Morgan regarding the two stints.  But mlb.com (with 1999
Palmer data) and retrosheet suggest that something else is afoot.
      http://retrosheet.org/boxesetc/Pkowaf101.htm
      http://mlb.mlb.com/NASApp/mlb/mlb/stats_historical/mlb_individual_stats_player.jsp?playerID=117280

Three of the problem stints in lahman52 are for hoganed02, perhaps by
oversight when the records for -ed01 and -ed02 were allocated (recently).
The splits for Hogan are available at bb-ref:
1884 0 0 11
1887 1 2 26
1888 26 0 52

The other 31 problem stints are for primary pitchers most of whom played
fewer than 8 OF games.  Several, maybe all, are missing from the TB6 and
BBE batting registers.  Retrosheet shows 0-0-0 games for Will White
      http://retrosheet.org/boxesetc/Pwhitw103.htm
but I wonder whether LF-CF-RF data is *missing* (by clerical error)
rather than zero.  That is, I am not sure that the latterday LF-CF-RF
data-gathering project found zero outfield games for White, as his
Retrosheet fielding table suggests.

Paul Wendt

#2673 From: "dsreyn" <dreynolds@...>
Date: Wed Jun 1, 2005 3:05 am
Subject: Candidate corrections, 1997-2001
 
I've been trying to track down the source of some of the
inconsistencies in the lists I posted recently.  If the corrections
indicated below are correct, all of the problems I found from
1997-2001 will be resolved.  I have used The Sporting News Baseball
Guides as my "truth" source.

--------------------------------------------------

1997

- Batting table
TSN credits Rafael Palmeiro (BAL) with 7 IBB (BDB has 6).

- Pitching table
TSN credits John Johnstone (SFN) with 18.7 innings (BDB has 18.3).

TSN credits Wilson Alvarez (SFN) with 69 strikeouts (BDB has 36).
Thus, the team strikeout total for SFN in the teams table is correct.

- Teams table
TSN's team innings pitched totals should match the sums of the
individual pitchers' innings for all teams in BDB.  Outs pitched should
be changed for the following:  ANA - 4364, CHA - 4267, CLE - 4277, DET
- 4337, NYA - 4403, OAK - 4336, SEA - 4343, TOR - 4328, ATL - 4397,
COL - 4298, FLO - 4340, LAN - 4378, NYN - 4378, PHI - 4261.  NOTE:
SFN outs are correct as given in the teams table at 4338; the mismatch
is due to Johnstone's individual statistics (see above).

Team strikeouts for ML4 should be 1016 (the sum of the individual totals).

Team hits allowed and BB allowed for BOS should be 1569 and 611 (the
sum of the individual totals).
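
The innings figures in these corrections use a decimal shorthand for thirds of an inning, while the teams table stores outs. A minimal sketch of the conversion follows, assuming ".3" (or ".1") means one out and ".7" (or ".2") means two outs; the function name and the accepted spellings are my assumptions, not something either source specifies.

```python
def ip_to_outs(ip: str) -> int:
    """Convert an innings-pitched string such as '18.7' to outs.

    Assumes the fractional digit encodes thirds of an inning:
    no fraction or .0 -> 0 outs, .1 or .3 -> 1 out, .2 or .7 -> 2 outs.
    """
    whole, _, frac = ip.partition(".")
    thirds = {"": 0, "0": 0, "1": 1, "3": 1, "2": 2, "7": 2}
    if frac not in thirds:
        raise ValueError(f"unexpected innings fraction in {ip!r}")
    return int(whole) * 3 + thirds[frac]


# Johnstone (SFN, 1997): TSN has 18.7 innings, BDB has 18.3.
tsn_outs = ip_to_outs("18.7")
bdb_outs = ip_to_outs("18.3")
print("TSN outs:", tsn_outs, "BDB outs:", bdb_outs,
      "difference:", tsn_outs - bdb_outs)
```

Working in outs keeps the team-versus-individual comparison in whole numbers, so a one-out difference like Johnstone's shows up exactly rather than as a rounding question.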

--------------------------------------------------

1998

- Pitching table
The runs allowed for several pitchers are incorrect (the team totals
in the team tables are correct):
Doug Bochtler (DET) - TSN has 48 (BDB has 46)
Danny Rios (KCA) - TSN has 9 (BDB has 8)
Hideki Irabu (NYA) - TSN has 79 (BDB has 78)
Brian Anderson (ARI) - TSN has 109 (BDB has 100)
Felix Heredia (CHN) - TSN has 9 (BDB has 8)
Antonio Alfonseca (FLO) - TSN has 36 (BDB has 32)
Felix Heredia (FLO) - TSN has 30 (BDB has 25)
Bobby Jones (NYN) - TSN has 94 (BDB has 88)

The HBP for several pitchers are incorrect:
John Thomson (COL) - TSN has 2 (BDB has 0)
Gabe Gonzalez (FLO) - TSN has 1 (BDB has 0)
Benj Sampson (MIN) - TSN has 1 (BDB has 0)

The IBB for two pitchers are incorrect:
Pete Harnisch (CIN) - TSN has 4 (BDB has 1)
Rob Stanifer (FLO) - TSN has 2 (BDB has 0)

TSN credits Greg McMichael (NYN) with 53.7 innings pitched (BDB has 53.3).

- Teams table
TSN's team innings pitched totals should match the sums of the
individual pitchers' innings for all teams in BDB.  Outs pitched should
be changed for the following:  BAL - 4294, CHA - 4316, DET - 4339, KCA
- 4309, MIN - 4343, NYA - 4370, SEA - 4273, TEX - 4294, ARI - 4297,
ATL - 4316, CHN - 4432, CIN - 4324, COL - 4298, FLO - 4349, HOU -
4414, LAN - 4342, SDN - 4364.  NOTE:  NYN outs are correct as given in
the teams table at 4374; the mismatch is due to McMichael's individual
statistics (see above).

--------------------------------------------------

1999

- Pitching table
TSN credits LaTroy Hawkins (MIN) with 174.3 innings (BDB has 175.3).

- Teams table
TSN's team innings pitched totals match the sums of the individual
pitchers' innings for all teams in BDB.  Outs pitched should be changed
for the following:  ANA - 4294, BOS - 4310, CHA - 4315, CLE - 4351,
DET - 4263, KCA - 4262, MIN - 4270, NYA - 4319, OAK - 4315, SEA -
4301, TBA - 4299, TEX - 4309, TOR - 4317, ARI - 4402, CHN - 4292, CIN
- 4386, COL - 4287, FLO - 4307, HOU - 4376, LAN - 4359, MIL - 4328,
MON - 4303, NYN - 4370, PHI - 4315, PIT - 4300, SDN - 4261, SFN -
4369, SLN - 4336.  NOTE:  the total quoted here for MIN assumes that
Hawkins' innings are corrected as above.

--------------------------------------------------

2000

- Batting table
TSN credits Steve Cox (TBA) with 5 HBP (BDB has 4).

TSN credits Alex Ochoa (CIN) with 9 stolen bases (the
www.baseball1.com version has 8; the www.baseball-databank.org
version has 9).

- Pitching table
TSN credits Masato Yoshii (COL) with 6 IBB (BDB has 5).
TSN credits Frank Rodriguez (SEA) with 2 IBB (BDB has 1).

- Teams table
Batting HBP for TBA should be 51 (BDB has 48).

Stolen bases for CIN should be 100 (both BDB versions have 99).

TSN's team innings pitched totals match the sums of the individual
pitchers' innings for all teams in BDB.  Outs pitched should be changed
for the following:  BAL - 4300, BOS - 4358, CHA - 4351, CLE - 4327,
DET - 4330, KCA - 4318, MIN - 4298, NYA - 4273, OAK - 4306, SEA -
4325, TBA - 4294, TOR - 4312, ARI - 4331, ATL - 4321, CHN - 4364, CIN
- 4369, FLO - 4289, HOU - 4313, MIL - 4399, MON - 4274, PHI - 4316,
SDN - 4378, SFN - 4333, SLN - 4301.

--------------------------------------------------

2001

- Batting table
TSN credits Tony Batista with 1 IBB with TOR, and 0 IBB with BAL (BDB
credits him with 1 IBB with each team).
TSN credits Mike Bordick (BAL) with 1 IBB (BDB has 0).
TSN credits Delino DeShields (BAL) with 1 IBB (BDB has 0).

--------------------------------------------------

More to come.

Doug

#2675 From: Paul Wendt <pgw02472@...>
Date: Wed Jun 1, 2005 3:41 pm
Subject: Re: LF-CF-RF missing from FieldingOF
 
On Tue, 31 May 2005, Paul Wendt wrote:

> Joining the Fielding and FieldingOF tables in lahman52, I find 36 stints
> with OF data but no LF-CF-RF data in FieldingOF.  There may also be a
> problem re Kowalik and the last class of 31 (below) at retrosheet.
. . .
> The other 31 problem stints are for primary pitchers most of whom played
> fewer than 8 OF games.

Another pattern is clear.  Too bad I had to race for a bus yesterday.
The "other 31" ordered by player-team-stint:

(tab delimited, for ASCII readers)
playerID yearID stint LF CF RF OF P PS
whitewi01 1882 1 		 2 54 54
whitewi01 1880 1 		 3 62 62
whitrbi01 1890 1 		 1 16 11
whitrbi01 1894 2 		 7 10 8
wickebo01 1902 1 		 3 22 16
wickebo01 1904 1 		 20 30 27
wickebo01 1905 1 		 3 22 22
widnewi01 1888 1 		 2 13 13
widnewi01 1889 1 		 1 41 34
wilheka01 1903 1 		 1 12 9
wilheka01 1905 1 		 4 34 27
willeed01 1909 1 		 1 41 34
willeed01 1910 1 		 1 37 25
willile03 1926 1 		 1 8 0
willipo01 1903 3 		 2 10 10
willipo01 1902 1 		 6 31 31
willito01 1892 1 		 1 2 1
willito01 1893 1 		 3 5 2
wilsohi01 1903 1 		 1 30 28
wilsoze01 1897 1 		 2 34 30
wilsoze01 1898 1 		 3 33 31
wiltsho01 1905 1 		 1 32 19
wiltsho01 1906 1 		 2 38 26
wiltsho01 1904 1 		 1 24 16
wiltsho01 1907 1 		 1 33 21
wiltssn01 1902 2 		 4 19 18
wingaer01 1925 1 		 1 32 18
wingaer01 1926 1 		 2 39 16
wisebi01 1886 1 		 1 1 1
wisebi01 1882 1 		 2 3 3
wisebi01 1884 1 		 43 50 41

It appears that a segment of pitcher-outfielders defined partly by
lastname was lost or overlooked.
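
A quick way to see the pattern is to look at the alphabetical range of the playerIDs in the listing above. The sketch below counts stints per player as I read them from that listing (the dictionary is my tabulation, so treat the counts as an assumption) and reports the first and last ID in sort order.

```python
# Number of problem stints per playerID, tallied from the listing above.
stints_per_player = {
    "whitewi01": 2, "whitrbi01": 2, "wickebo01": 3, "widnewi01": 2,
    "wilheka01": 2, "willeed01": 2, "willile03": 1, "willipo01": 2,
    "willito01": 2, "wilsohi01": 1, "wilsoze01": 2, "wiltsho01": 4,
    "wiltssn01": 1, "wingaer01": 2, "wisebi01": 3,
}

total_stints = sum(stints_per_player.values())
first_id = min(stints_per_player)
last_id = max(stints_per_player)

print("players:", len(stints_per_player), "stints:", total_stints)
print("alphabetical range:", first_id, "to", last_id)
```

Every ID falls between whitewi01 and wisebi01, which is consistent with one contiguous alphabetical block of records having been dropped.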

> Several, maybe all, are missing from the TB6 and BBE batting registers.
> Retrosheet shows 0-0-0 games for Will White
>    http://retrosheet.org/boxesetc/Pwhitw103.htm
> but I wonder whether LF-CF-RF data is *missing* (by clerical error)
> rather than zero.  That is, I am not sure that the latterday LF-CF-RF
> data-gathering project found zero outfield games for White, as his
> Retrosheet fielding table suggests.

I checked Wicker and Wise, whose seven P-OF stints include two with
20 and 43 OF games played.
      http://retrosheet.org/boxesetc/Pwickb101.htm
      http://retrosheet.org/boxesetc/Pwiseb101.htm
Yes, I think the zeroes at Retrosheet represent missing data.

Paul Wendt

#2676 From: "dsreyn" <dreynolds@...>
Date: Sat Jun 4, 2005 3:04 pm
Subject: More corrections (1991-1996)
 
These corrections will resolve all of the inconsistencies I found from
1991-1996.  I have used a combination of the annual TSN Baseball
Guides and the Stats Inc. All-Time Baseball Sourcebook to check these.

Doug

--------------------------------------------------

1991

- Batting table
TSN credits Jose Canseco (OAK) with 8 IBB (BDB has 7).

- Teams table
TSN's team innings pitched totals match the sums of the individual
pitchers' innings for all teams in BDB.  Outs pitched should be changed
for the following:  BAL - 4373, BOS - 4319, CAL - 4325, CLE - 4324,
DET - 4351, MIN - 4348, ML4 - 4391, OAK - 4333, SEA - 4393, TOR -
4388, ATL - 4358, CHN - 4370, MON - 4321, NYN - 4312, PIT - 4370, SDN
- 4358, SLN - 4306.

--------------------------------------------------

1992

- Teams table
TSN's team innings pitched totals match the sums of the individual
pitchers' innings for all teams in BDB.  Outs pitched should be changed
for the following:  BOS - 4346, CHA - 4385, DET - 4307, KCA - 4342,
NYA - 4358, TEX - 4381, TOR - 4322, CIN - 4349, HOU - 4378, NYN -
4340, PIT - 4439, SDN - 4384.

--------------------------------------------------

1993

- Teams table
TSN's team innings pitched totals match the sums of the individual
pitchers' innings for all teams in BDB.  Outs pitched should be changed
for the following:  BAL - 4328, BOS - 4357, CAL - 4291, CLE - 4337,
DET - 4310, KCA - 4336, MIN - 4333, NYA - 4315, OAK - 4357, SEA -
4361, TEX - 4315, TOR - 4324, CHN - 4349, COL - 4294, FLO - 4321, HOU
- 4324, LAN - 4418, MON - 4370, PHI - 4418, PIT - 4337, SDN - 4313,
SFN - 4370.

--------------------------------------------------

1994

- Teams table
TSN shows OAK with 589 runs allowed, and LAN with 509.  These totals
match the individual totals in BDB (the entries in the team table are
incorrect).

TSN's team innings pitched totals match the sums of the individual
pitchers' innings for all teams in BDB.  Outs pitched should be changed
for the following:  BAL - 2993, BOS - 3088, CLE - 3056, KCA - 3095,
NYA - 3059, OAK - 3010, ATL - 3079, CHN - 3071, CIN - 3115, HOU -
3089, LAN - 3042, MON - 3110, PHI - 3073, PIT - 3017, SDN - 3137, SFN
- 3076.

--------------------------------------------------

1995

- Pitching table
TSN credits Tim Davis (SEA) with 24.0 innings (BDB has 20.3).

- Teams table
TSN's team innings pitched totals match the sums of the individual
pitchers' innings for all teams in BDB.  Outs pitched should be changed
for the following:  BOS - 3878, CAL - 3853, CHA - 3854, MIN - 3818,
NYA - 3854, TOR - 3878, ATL - 3875, CIN - 3868, COL - 3865, HOU -
3961, MON - 3851, PHI - 3871, PIT - 3826, SDN - 3854, SFN - 3881, SLN
- 3797.

The outs pitched for SEA should be 3868 (consistent with the
individual totals including the correction to Davis' innings, above).

--------------------------------------------------

1996

- Batting table
TSN credits Wally Joyner (SDN) with 8 IBB (BDB has 7).
TSN credits Luis Alicea (SLN) with 10 IBB (BDB has 9).

- Pitching table
TSN credits Tim Pugh (CIN) with 15.7 innings (BDB has 15.0).

- Teams table
TSN's team innings pitched totals match the sums of the individual
pitchers' innings for all teams in BDB.  Outs pitched should be changed
for the following:  BAL - 4406, CLE - 4357, DET - 4298, MIN - 4319,
ML4 - 4342, OAK - 4369, SEA - 4295, TEX - 4348, TOR - 4337, CHN -
4369, COL - 4268, LAN - 4399, PHI - 4270, SFN - 4327, SLN - 4357.
NOTE:  CIN outs are correct as given in the teams table at 4327; the
mismatch is due to Pugh's individual statistics (see above).
