
Similarity Scores – How Many Clicks to Babe Ruth?

Posted by Andy on December 15, 2010

Reader Matthew Bohnert wrote in describing a game that I've actually played myself with the Baseball-Reference.com player pages. Matthew writes:

"One bizarre and silly suggestion for a time-waster. I am fascinated by Similarity Scores. I can’t really say why. But I am.

One mindless game I like to play is “How Many Clicks To Babe Ruth”? The game starts by picking any batter. Any - at random, from any point in baseball history. Click on their page. It could be Felix Millan. It could be Ted Williams. It doesn’t matter.

Then you start plotting a strategy whereby you click through the Top 10 Similarity Scores shown on the bottom of the page, linking your way to Babe Ruth in as few clicks as possible. Hank Aaron obviously takes only 1 link. Somebody like Enzo Hernandez or Johnnie LeMaster usually takes about 25 or 30. But it can be done. I have yet to find a player where it’s impossible."

I went back and asked Matthew to send me his click progression to get from Johnnie LeMaster to Babe Ruth and here's what he sent back:

"Here's the link progression for Johnnie LeMaster.   It's not optimized....in other words, I just fumbled through it right now and you could certainly cut out some 3-4 link loops and probably find a faster path.

It's also fairly typical for a light-hitting infielder, whereby you mess around and don't get anywhere for about 10-12 links, and then you stumble upon a good player with some decent power totals, like Ryne Sandberg.   Once you get traction there, you know it's just a matter of time before you link your way to the promised land.

I've also found that Vern Stephens pops up frequently as a critical link around steps 10-12, to connect underwhelming batting careers with the Sultan of Swat.

1. Johnnie LeMaster
2. Dick Schofield
3. Mike Gallego
4. Brian Giles
5. Gene Mauch
6. Jerry Dybzinski
7. Ron Gardenhire
8. Tony Pena
9. Kiko Garcia
10.  Mike Phillips
11.  Frank White (here's where we start to take off)
12.  Bill Mazeroski
13.  Roberto Alomar
14.  Ryne Sandberg (now we know it's just a matter of time)
15.  Lou Whitaker
16.  Joe Morgan
17.  Craig Biggio
18.  George Brett
19.  Dave Winfield (closing in for the kill)
20.  Frank Robinson
21.  Hank Aaron
22.  Babe Ruth

So 22 links from Johnnie LeMaster to Babe Ruth.

What a great game!   You can have fun with your friends.  Impress chicks.  Wow business associates.   All on Baseball-Reference.com!"

I imagine a lot of readers out there have played this game already. What have you discovered?

69 Responses to “Similarity Scores – How Many Clicks to Babe Ruth?”

  1. carlsonjok Says:

    I would think a tool like the Oracle of Bacon could be modified to provide the optimized answer to compare your "manual" attempt against.
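A rough sketch of what such a tool might compute: treat each player as a node and each name on his Top 10 list as a one-way link, then run a breadth-first search, which returns a chain with the fewest clicks. The function name `shortest_chain` is hypothetical, and the `links` data is an assumption for illustration only. It is typed by hand from the tail of Matthew's chain in the post, not pulled from Baseball-Reference, so each player has one link here instead of ten.

```python
from collections import deque

def shortest_chain(links, start, target):
    """Breadth-first search over one-way links; returns a shortest chain or None."""
    if start == target:
        return [start]
    parent = {start: None}          # also serves as the "already seen" set
    queue = deque([start])
    while queue:
        player = queue.popleft()
        for comp in links.get(player, ()):
            if comp in parent:
                continue
            parent[comp] = player
            if comp == target:
                # Walk the parent pointers back to the start.
                path = [comp]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(comp)
    return None

# Assumption: a hand-typed fragment of the chain quoted in the post.
chain = ["Frank White", "Bill Mazeroski", "Roberto Alomar", "Ryne Sandberg",
         "Lou Whitaker", "Joe Morgan", "Craig Biggio", "George Brett",
         "Dave Winfield", "Frank Robinson", "Hank Aaron", "Babe Ruth"]
links = {a: [b] for a, b in zip(chain, chain[1:])}

print(shortest_chain(links, "George Brett", "Babe Ruth"))
```

With full Top 10 lists loaded into `links`, the same search would give the optimized answer to compare a manual attempt against. A result of `None` would mean the target cannot be reached at all.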

  2. Andrew Says:

    Just drove myself crazy trying to do this for Juan Pierre...

  3. Jimbo Says:

    I tried to find a faster route from Brett to Ruth but couldn't do it.

  4. Jimbo Says:

    and haha I was doing it for Alfredo Griffin! Went crazy and gave up.

  5. WanderingWinder Says:

    I was doing this last week...

  6. Lascor Says:

    It took me 14 to get to Babe.

    1. Johnnie LeMaster
    2. Charley O'Leary
    3. Everett Scott
    4. Roger Peckinpaugh
    5. Luis Aparicio
    6. Red Schoendienst
    7. Billy Herman
    8. Joe Sewell
    9. Derek Jeter
    10. Pete Rose
    11. Ty Cobb
    12. Stan Musial
    13. Lou Gehrig
    14. Babe Ruth

    Totally cool game! Congrats!

  7. Matt Says:

    Catchers are tough. I tried Bruce Benedict and couldn't get there. I thought that I had a path out, once I started finding guys who were C/1B, but it never went anywhere.

  8. Jimbo Says:

    Alfredo Griffin
    Don Kessinger
    Phil Rizzuto
    Jose Offerman
    Adam Kennedy
    Julio Lugo
    Jeff Blauser
    Juan Uribe
    Jose Hernandez
    Davey Johnson (was in the league 14 years and had 136 homers, but once had 43 in a year!!!)
    Charlie Hayes
    Juan Encarnacion
    Jacque Jones
    Casey Blake
    Preston Wilson
    Pete Incaviglia
    Jose Cruz
    Jesse Barfield
    Dean Palmer
    Jay Buhner
    Troy Glaus
    Adam Dunn
    Ralph Kiner
    Albert Belle
    Hank Greenberg
    Albert Pujols
    Jason Giambi
    Carlos Delgado
    Jeff Bagwell
    Frank Thomas
    Jimmie Foxx
    Lou Gehrig
    Babe Ruth

    33 links.

  9. Jimbo Says:

    @6

    I think you are mixing columns or something. There isn't a direct similarity link from Derek Jeter to Pete Rose.

    Is there a way to edit posts here? I could've cut Albert Belle out of my Griffin-Ruth sequence lol.

  10. Lascor Says:

    @9

    Yes there is. Similar Batters Through 36.

  11. CRTYonker Says:

    Great Time Waster. 15 links from Babe Ruth to Matt Stairs

    Babe Ruth
    Frank Robinson
    Al Kaline
    Billy Williams
    Jim Rice
    Joe Carter (1st non-HOF)
    Gary Gaetti
    Brian Downing
    Sal Bando
    Larry Parrish
    Torii Hunter
    George Bell
    Ben Oglivie
    Jeff Burroughs
    Matt Stairs

    Curiously, it doesn't work the other way. For example, Larry Parrish is in the Similarity Score list for Sal Bando, but it isn't reciprocated. Also, you need Burroughs to get from Oglivie to Stairs, even though Oglivie is on Stairs's Similarity Score list.

  12. Andy Says:

    @ #11 Indeed going backwards from Ruth is a very different game, since HIS 10 most similar players are not necessarily the same players for whom Ruth shows up on their lists. There are lots of cases of guys who do not show up on each other's lists.

    @ #10 That's also a different game, using most similar batters instead of just overall similarity scores.
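The one-way nature of the links can be sketched in a few lines. The `links` data below is an assumption: it holds only the links described in comment #11, typed by hand, and the helper names `is_reciprocal` and `listed_by` are hypothetical.

```python
def is_reciprocal(links, a, b):
    """True only if each player appears on the other's list."""
    return b in links.get(a, ()) and a in links.get(b, ())

def listed_by(links, player):
    """Players whose lists contain `player`, i.e. the pages you can arrive from."""
    return sorted(p for p, comps in links.items() if player in comps)

# Assumption: only the links mentioned in comment #11; real lists hold ten names.
links = {
    "Sal Bando": ["Larry Parrish"],
    "Larry Parrish": ["Torii Hunter"],   # per #11, Bando is not on Parrish's list
    "Ben Oglivie": ["Jeff Burroughs"],
    "Jeff Burroughs": ["Matt Stairs"],
    "Matt Stairs": ["Ben Oglivie"],
}

print(is_reciprocal(links, "Sal Bando", "Larry Parrish"))  # False
print(listed_by(links, "Matt Stairs"))                      # ['Jeff Burroughs']
```

Because the fragment is incomplete, a `False` here is only meaningful for the pairs the commenter actually checked.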

  13. dukeofflatbush Says:

    I think a better strategy for 'getting' to Ruth is to try to find dead-ball-era batters who may have pitched a bit, and to look for an age similarity score.

  14. DavidRF Says:

    The game is a bit harder if you limit yourself to career comps. In the LeMaster example, the step from Gallego to Giles leveraged the fact that those two were similar at age 27 (both got late starts in MLB).

    Using the age comps is sorta cheating in my opinion. Jumping from Luis Rivas to Paul Molitor or Rod Carew is not really in the spirit of the game. My two cents.

  15. DavidRF Says:

    Oops... wrong Brian Giles! Still... my point still holds.

    Got to be a better way to go from Gallego besides wrong-Brian-Giles. Wrong-Brian-Giles is a big step backwards!

  16. Lawrence Azrin Says:

    Yes, I've played this time-waster before, and the best way to get to Babe Ruth is just look for the player in the "Ten most similar" who has the highest home run total, that'll lead you to Ruth eventually. Problem is, you have a vague idea of that for players with long careers, but for more obscure guys it's close to random chance.

    For example, I tried this with Ray Oyler, one of the most infamous "good-field, no hit" guys in the last 50 years, but had to give up, since it was literally guesswork, and I didn't want to "cheat" and look up HR totals of possible candidates.

  17. Paul Says:

    Took me 55 links to get from Rey Ordonez to Babe Ruth:

    Rey Ordonez
    Jose Uribe
    Marty Perez
    Billy Ripken
    Nick Punto
    Jeff Reboulet
    Chuck Hiller
    Roberto Pena
    Pat Listach
    Carlos Febles
    Ken Aspromonte
    Bret Barberie
    Aki Iwamura
    Alberto Callaspo
    Jeff Keppinger
    Asdrubal Cabrera
    Alexei Ramirez
    Ryan Doumit
    Stephen Drew
    Tulow
    Pedroia
    Kinsler
    Hill, A
    Phillips, B
    Sabo
    Hammonds
    Grieve
    Wilkerson
    Swisher
    Hidalgo
    Hill, Glenallen
    Davis, G
    Morneau
    Bay
    Horner
    Rosen
    Wright
    Holliday
    Cabrera, M
    Teixeira
    Sauer, H
    Buhner
    Strawberry
    Kiner
    Belle
    Greenberg
    Pujols
    Giambi
    Delgado
    Stargell
    McCovey
    Thomas, F
    Foxx
    Gehrig
    Ruth

  18. John Autin Says:

    A curious sequence climbing the LeMaster-to-Ruth ladder highlights a personal obsession:

    12. Bill Mazeroski (HOF 2B)
    13. Roberto Alomar (presumptive future HOF 2B)
    14. Ryne Sandberg (HOF 2B)
    15. Lou Whitaker
    16. Joe Morgan (HOF 2B)
    17. Craig Biggio (HOF 2B)

    Yeah, I know it's a game, not a science -- but there's still something wrong with that picture....

  19. Tristan Says:

    You can't get there from Cesar Crespo. Then again, he was apparently uncomparable...

  20. Tony Says:

    Maybe Casey Close already did this and therefore knew that you can get from Jeter to Ruth in no more than 10 steps, since Sandberg is in Jeter's top 10.

  21. gcm Says:

    glad to see i'm not the only one who does this from time to time.

    i'll add that i've discovered that there are a ton of no hit backup catchers out there that i've never heard of, and if you are playing this game and get into a pool of them, it is hard to get out.

    also, i don't think game can be done for pitchers with cy young or walter johnson as the target, if you stick to career comps only. i think greg maddux broke the link by being too similar to warren spahn.

  22. DavidRF Says:

    Using age-comps...

    Jay Bruce
    Barry Bonds
    Babe Ruth

    ... that should open a few back doors.

  23. Paul Says:

    @Tristan --- LOL, ya... you actually have to have comps for it to work!

  24. sansho1 Says:

    The Holy Grail would have to be Bill Bergen to Ruth, wouldn't it? I hope someone else tries this before I have to....

  25. Anon Says:

    I have drifted through b-ref this way before but never tried to link from one player to another - cool idea!

    I got Juan Pierre:

    Pierre
    Matty Alou
    Enos Cabell
    Steve Finley
    Chili Davis
    Dave Winfield
    Frank Robinson
    Hank Aaron
    Babe Ruth

  26. Paul Says:

    @Anon - that was surprisingly easy for Pierre.

  27. Anon Says:

    Here's 1 route from Bill Bergen to the Babe:

    Bergen
    Gabby Street
    Charley O'Leary
    Johnnie LeMaster

    I won't repeat the rest of the sequence from above but we know you can get from LeMaster to the Babe. . . .

  28. Anon Says:

    @ Paul #26 -

    "@Anon - that was surprisingly easy for Pierre."

    LH hitting Outfielder - how hard could it be? 🙂

  29. sansho1 Says:

    One Holy Grail coming right up! (seasonal shipping delays possible)

  30. DavidRF Says:

    With Charley O'Leary in the mix, you can start playing games with *really* old comps.

    Bill Bergen
    Gabby Street
    Charley O'Leary
    Nick Altrock
    Minnie Minoso
    Jim O'Rourke
    Jesse Burkett
    Shoeless Joe Jackson
    Stan Musial
    Hank Aaron
    Babe Ruth

  31. dukeofflatbush Says:

    How about Archibald 'Moonlight' Graham?
    Not bloody likely!

  32. Geni Says:

    I managed to connect Nolan Ryan with Babe Ruth using just top 10 similarity scores...took 22, and hey! Jamie Moyer!

    Nolan Ryan
    Warren Spahn
    Tommy John
    Burleigh Grimes
    Dennis Martinez
    Jamie Moyer
    Red Ruffing
    Sad Sam Jones
    Earl Whitehill
    Hooks Dauss
    Wilbur Cooper
    Stan Coveleski
    Babe Adams
    Urban Shocker
    Eddie Rommel
    Rip Sewell
    Pat Malone
    Vic Raschi
    Sal Maglie
    Jeff Tesreau
    Carl Lundgren
    Babe Ruth

  33. DavidRF Says:

    @31
    You have to play a certain number of games before bb-ref will post a set of sim scores for you apparently. So no starting point for Moonlight Graham. I found that out while hoping to do John Paciorek.

  34. CRTYonker Says:

    Still seems like "cheating" to use the Similarity By Age list.

  35. Tmckelv Says:

    @18,

    I noticed the same thing about Whitaker.

  36. Lawrence Azrin Says:

    #34/ CRTYonker Says: "Still seems like "cheating" to use the Similarity By Age list."

    Agreed, to quote from the intro "...you start plotting a strategy whereby you click through the Top 10 Similarity Scores shown...". This is different than the Most Similar By Age listing on the right-hand side. I noticed this when trying to duplicate Juan Pierre-to-Babe Ruth in #25.

  37. Wine Curmudgeon Says:

    Glad to see someone got Rey Ordonez. I'm stuck on Mick Kelleher.

  38. DavidRF Says:

    Mick Kelleher
    Nate Oliver
    Larry Milbourne
    Mike Gallego
    ... which gets you to the Johnnie LeMaster chain.

    ("cheating" with using the age-comps column of course)

  39. Wine Curmudgeon Says:

    Hey, DavidRF, my curse is that I'm a Cubs fan, so you know I'm not going to take any shortcuts. And how depressing is it that Kelleher and Nate Oliver played for the Cubs?

  40. DavidRF Says:

    No shortcuts?

    It's much harder using just the career comps list, but I think I just did it.

    Mick Kelleher
    Jack Heidemann
    Jim Anderson
    Ted Kazanski
    John Boccabella
    Todd Cruz
    Pedro Garcia
    Jerry Kindall
    Dave Roberts
    Bobby Morgan
    Ted Lepcio
    Gene Oliver
    John Wockenfuss
    Morgan Ensberg
    Jose Bautista
    Nick Esasky
    Bo Jackson
    Ron Kittle
    Jim Gentile
    Glenn Davis
    Dick Stuart
    Tony Clark
    Cecil Fielder
    Jeromy Burnitz
    Darryl Strawberry
    Ralph Kiner
    Albert Belle
    Juan Gonzalez
    Jose Canseco
    Willie Stargell
    Willie McCovey
    Harmon Killebrew
    Sammy Sosa
    Ken Griffey
    Willie Mays
    Hank Aaron
    Babe Ruth

    .... whew...

    The strategy was just to try to maximize HR at every step and occasionally try someone else if I hit a dead end. It was very slow going until I hit Morgan Ensberg. Probably some shortcuts to be had in there. I figured it would go faster after I hit McCovey. Someone could probably improve on that.
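That strategy (click the comp with the most home runs, back up when you hit a dead end) can be sketched as a greedy depth-first search. Everything here is an illustrative assumption: `greedy_chain` is a hypothetical name, the `links` fragment is typed by hand from the last steps of the chain above, and the home run figures are career totals entered by hand, not fetched from the site.

```python
def greedy_chain(links, home_runs, start, target):
    """Depth-first search that tries the highest-HR comp first and backtracks."""
    visited = {start}
    path = [start]

    def visit(player):
        if player == target:
            return True
        # Click the target right away if it is listed; otherwise most HR first.
        comps = sorted(links.get(player, ()),
                       key=lambda c: (c != target, -home_runs.get(c, 0)))
        for comp in comps:
            if comp in visited:
                continue
            visited.add(comp)
            path.append(comp)
            if visit(comp):
                return True
            path.pop()              # dead end: back up and try the next comp
        return False

    return path if visit(start) else None

# Assumption: hand-typed tail of the chain in this comment, with career HR totals.
home_runs = {"Willie Stargell": 475, "Willie McCovey": 521, "Harmon Killebrew": 573,
             "Sammy Sosa": 609, "Ken Griffey": 630, "Willie Mays": 660,
             "Hank Aaron": 755, "Babe Ruth": 714}
chain = list(home_runs)
links = {a: [b] for a, b in zip(chain, chain[1:])}

print(greedy_chain(links, home_runs, "Willie Stargell", "Babe Ruth"))
```

A greedy search like this finds a chain if one exists, but not necessarily the shortest one, which fits the "probably some shortcuts to be had" caveat.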

  41. DavidRF Says:

    Yeah, it's probably faster using infielders to jump from 200 to 500 HR. The positional adjustment makes it easy to make some pretty large leaps. Hence the Vern Stephens comment in the original post.

    I'm supposed to be working though. Maybe I'll try again later.

  42. Paul Says:

    What about taking some slight power, no speed guys through the chain to Rickey Henderson? Maybe some of those catchers discussed, someone like a Greg Dobbs or Kevin Maas.

  43. Robert Says:

    Took me 60 to get from 2 career HR Rafael Belliard to Babe Ruth:

    Rafael Belliard
    John Sullivan
    Enzo Hernandez
    Bob Lillis
    Ruben Amaro Sr.
    Ted Kubiak
    Curtis Wilkerson
    Abraham Nunez
    Billy Ripken

    ...and then follow everyone after Billy Ripken in #17's post...

  44. Jimbo Says:

    Kevin Maas was a first baseman. I always remember the start of his career. If you look in his career log you will see for his rookie year he had 13 home runs in his first 113 PAs! Around that time he hit his 15th homer against my Blue Jays, and also in that series hit a massive moonshot that looked to be a monster homer but ended up settling down on the warning track. I thought he was gonna be the next Babe Ruth.

    He hammered the ball for a .902 OPS that year, and looked to be a big up and comer for the Yankees. He never hit well again though.

  45. Jeff J. Says:

    @16

    Good strategy. I wonder what the Dave Eggler to Babe Ruth chain would look like.

  46. barkfart Says:

    how comforting to know that I'm not the only one wasting endless hours in this way.

    I HAVE NO REGRETS!!

  47. TonyBaloney Says:

    #42 - I don't think Rickey shows up on anyone's top 10 similarity scores (though I like the idea...)

  48. Doug A Says:

    The similarity line can be very frustrating! Who could've imagined how long it would take to go from Willie Mays Aikens to Willie Mays! Once I got to Dave Parker, I knew it was coming soon - Parker to Dawson to Winfield to Robinson to Mays. Willie Aikens was born two and a half weeks after The Catch. I wonder how many other players were named after major leaguers (not counting families, e.g. Griffey).

  49. John Autin Says:

    @48, Doug -- Larry Doby Johnson got to play for both the Indians and the White Sox, appropriately.

  50. TonyBaloney Says:

    I got from Griffey Sr. to Griffey Jr. via Cruz Sr, Willie Davis, Pinson, Parker, Dawson, and Winfield.

    Still not as good as Bill Swift appearing on Bill Swift's top 10 similarity scores (and vice versa)...

  51. The DGs Says:

    Here's a challenge, go from Wally Pipp to Lou Gehrig.
    I don't have the attention span to try it.

  52. Paul Says:

    @51.

    1 Wally Pipp
    2 Frank Schulte
    3 Curt Flood
    4 Jim Piersall
    5 Enos Cabell
    6 Mickey Rivers
    7 Matty Alou
    8 Juan Pierre
    9 Steve Brodie
    10 Lance Johnson
    11 Mookie Wilson
    12 Dan Gladden
    13 Chad Curtis
    14 Lee Mazzilli
    15 Gary Matthews Sr.
    16 Scott Spiezio
    17 Candy Maldonado
    18 Pedro Feliz
    19 Ed Sprague
    20 Scott Brosius
    21 Ty Wigginton
    22 Hank Blalock
    23 Michael Cuddyer
    24 Ben Grieve
    25 Brad Wilkerson
    26 Nick Swisher
    27 Paul Sorrento
    28 Glenn Davis
    29 Justin Morneau
    30 Jason Bay
    31 Al Rosen
    32 David Wright
    33 Matt Holliday
    34 Miguel Cabrera
    35 Mark Teixeira
    36 Hank Sauer
    37 Jay Buhner
    38 Darryl Strawberry
    39 Ralph Kiner
    40 Albert Belle
    41 Hank Greenberg
    42 Albert Pujols
    43 Jason Giambi
    44 Carlos Delgado
    45 Willie Stargell
    46 Willie McCovey
    47 Frank Thomas
    48 Jimmie Foxx
    49 Lou Gehrig

  53. Paul Says:

    Another fun one is getting from CJ Wilson to Cliff Lee as Wilson will replace Lee as the Rangers ace. I'll let you do Colby Lewis if you'd rather. Either could become the #1 and both will prove rather tricky. Good luck.

  54. Jeff J. Says:

    @48

    Some guy named Mickey Mantle, wonder if he turned out any good?

  55. Jeff J. Says:

    @42

    What about Russ Nixon?

  56. Jeff J. Says:

    @48 "Once I got to Dave Parker, I knew it was coming soon - Parker to Dawson to Winfield to Robinson to Mays. Willie Aikens was born two and a half weeks after The Catch. "

    Would have been interesting if Winfield had been a Giant, considering when he was born.

  57. TonyBaloney Says:

    Is the path from Rick Ankiel to Babe Ruth quicker as a hitter, or as a pitcher???

  58. Wine Curmudgeon Says:

    Thanks, DavidRF....Your effort is awe inspiring.

  59. Doug A Says:

    @ 54 oh yeah, Mantle and Cochrane. I once had a friend who was a big Giants fan and whose last name was Willey. He insisted that he was going to name his daughters Mays and McCovey. He caved and gave them boring ordinary names though. Then his wife dumped him anyway. He should have stood his ground!

  60. Raker Says:

    I've played this game for years. It started for me when I tried to click from Babe Ruth to Horace Clarke. I got my brother playing this also.

    The interesting thing about this game is Babe Ruth is kind of a pivot point for all players because of his great pitching numbers. As it should be, The Babe is the connecting tissue for baseball's history.

    I usually try to make connections by using only 5 clicks. For example, The Babe is only 2 clicks from David Segui and just 3 from Jay Payton! Also, because of the Babe, guys like Frank Robinson are only 3 or 4 clicks from guys like Dwight Gooden and Bob Welch.

  61. Michael E Sullivan Says:

    There was once a proof in a similar game that the number would not be greater than 6. Paul Erdos was a mathematician who published papers with a very wide variety of different colleagues. Someone (maybe Erdos himself, I don't recall) suggested that one could count a mathematician's (or anybody in related fields who published with mathematicians) Erdos number by how many people you had to go through to get to Erdos, where you can link through anyone you have published a paper with.

    I believe that it was proven that all finite Erdos numbers were <=6 or some such fairly small number. There were a number of people who could not link to Erdos in this way, but everyone who did at all, could be linked within 6 or 7 steps.

    This is related to the 6 degrees of separation concept, where it is astronomically unlikely that any given pair of people anywhere in the world cannot be linked in <= 6 steps through people where each link is between two people who personally know each other.

    I would think similarity scores is a tougher nut, because we've artificially limited the space to top 10, while almost all people know many more than 10 people, and a fair number of mathematicians (for instance, Erdos) have co published with far more than 10 other authors.

    But that said, 10^6 is 1 million, which massively overwhelms the sample space of major league baseball players. So I'd hardly be surprised to discover that the largest defined Ruth number is fairly low. A lot of that meandering around with weak players until you find the key guy with decent stats can probably be cut to 2-3 links if we do a real analysis of the graph, and find those weak players who happen to link well.

    Players with highly unusual stats should generally represent good link points in this game. In order to jump dramatically in quality, you need people who are much better than you to be more similar to you than most of the players who are about as good. If a player has highly unusual stats, there is a good chance that any player who is unusual in the same way will show up on their similarity list, even if they were much better or worse as a player.
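One way the "real analysis of the graph" might look: reverse every link and run a single breadth-first search outward from Ruth, which yields each player's Ruth number (fewest clicks to reach him) in one pass. The name `clicks_to_target` is hypothetical, and the `links` fragment is an assumption stitched together by hand from chains quoted in the post and in comments #25, #40 and #48. Some of those chains may rely on age comps, and with complete Top 10 lists the numbers could only stay the same or shrink.

```python
from collections import deque

def clicks_to_target(links, target):
    """Fewest clicks from every player who can reach `target`, via reversed links."""
    reverse = {}
    for player, comps in links.items():
        for comp in comps:
            reverse.setdefault(comp, []).append(player)
    distance = {target: 0}
    queue = deque([target])
    while queue:
        player = queue.popleft()
        for source in reverse.get(player, ()):
            if source not in distance:
                distance[source] = distance[player] + 1
                queue.append(source)
    return distance

# Assumption: a hand-typed fragment built from chains quoted in this thread.
links = {
    "Dave Parker": ["Andre Dawson"],
    "Andre Dawson": ["Dave Winfield"],
    "Chili Davis": ["Dave Winfield"],
    "George Brett": ["Dave Winfield"],
    "Dave Winfield": ["Frank Robinson"],
    "Frank Robinson": ["Hank Aaron", "Willie Mays"],
    "Willie Mays": ["Hank Aaron"],
    "Hank Aaron": ["Babe Ruth"],
}

ruth_numbers = clicks_to_target(links, "Babe Ruth")
print(ruth_numbers["Dave Parker"])     # 5
print(max(ruth_numbers.values()))      # 5, the largest defined Ruth number here
```

Players missing from the result have no defined Ruth number, the analogue of mathematicians who cannot be linked to Erdos at all.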

  62. Andy Says:

    Pursuant to Michael's excellent comment at #61 and #1 about the Oracle of Bacon, we have here the Oracle of Baseball:

    http://www.baseball-reference.com/oracle/

    Where you can connect teammates. Just like Michael says, it rarely takes more than 6 links to connect any two baseball players.

  63. wendell Says:

    I got from Rafael Belliard to Ruth in 45 clicks
    Belliard hit a grand total of 2 HR but still played for 17 seasons.

  64. Jeff J. Says:

    @57

    What would be cool is linking from Babe Ruth first as a pitcher/hitter, through someone like Monte Ward, and then back to Babe Ruth as a hitter/pitcher (the one that you didn't start with)

  65. Jeff J. Says:

    @61

    17K big leaguers is also not 6 BILLION people

  66. Pat Lynch Says:

    Hehe! This is so funny, I've been doing this for years! I do sometimes cheat with ranked by age, but I swear I can get there faster than most of the others. Try some other games like trying to get from Ichiro to Sandberg, it's a blast!

  67. Joseph Says:

    Hey! This is a fun game. But I gave up on Enzo Hernandez after about 100 clicks--I did get it to Billy Ripken, who seems to be on most lists posted above-- thought that might be the way, but I just couldn't find it.

  68. Most similar players traded for each other » Baseball-Reference Blog » Blog Archive Says:

    [...] on to our fun with Similarity Scores, reader Rick Jennings wrote in with the following trivia [...]

  69. Jesse R. Says:

    I wanted to connect Carlos Quintana to Babe Ruth, but I accidentally switched to pitchers when I clicked on Al Orth. I'm not sure if oscillating between similar pitchers and similar batters is within the rules, but I did stay away from the "Similarity By Age" lists; I know DavidRF considers that cheating. I was happy I included Terry Francona, too (who, apparently, hit about as well as dead ball era pitchers hit).

    1. Carlos Quintana
    2. Terry Francona
    3. Al Orth
    4. Red Donahue
    5. Red Ehret
    6. Jimmy Ring
    7. Sid Hudson
    8. Slim Harriss
    9. Max Butcher
    10. Jim Tobin
    11. Ed Willett
    12. Phil Douglas
    13. Bob Rhoads
    14. Charlie Ferguson
    15. Carl Lundgren
    16. Babe Ruth