{
Shamko.txt: The Shamkovich Benchmark for NL parsing and chess data extraction

Revised: 1994.01.16

By Steven J. Edwards (sje@world.std.com)

This document contains the text of the Shamkovich Benchmark. This benchmark is intended for use by NL (natural language) parsing researchers and others interested in the automated extraction of chess data from text articles.

The article used for the benchmark is taken from page 27 of the 1994.02 issue of _Chess Life_ published by the USCF (United States Chess Federation). The author of the article is International Grandmaster Leonid Shamkovich. The text has been selected as an example of a typical English-language article with expert-level annotation of a chess game.

The magazine is copyrighted; this data is reproduced under the Fair Use clause of the United States Copyright Law, which allows a limited exception for "scientific, literary, or educational" usage. It is important to note this condition, as the annotation (i.e., commentary) portion of an annotated chess game is in general not redistributable without prior arrangements. The sole purpose of this document is to provide a standard, commonly available text for NL researchers.

The goal is to parse, via a program, as much of the article as possible and to demonstrate the completeness of the parse by comparing it to the goal text that gives the actual chess moves in a simple, clear format. A truly useful parsing algorithm will also be able to structure the annotation in a machine-readable format.

There are two parts to the benchmark. The first part is the article and the second part is the goal text. The article is taken directly from the printed version with all annotative text preserved, but with diagrams and diagram captions removed; these do not carry any additional semantic information. The editor's introductory comments are also removed.
Bold and italic typeface attributes have been folded into regular text as would be the case if the paper copy were processed by a mechanical scanner of moderate ability. Letter case is preserved as is the use of punctuation that is traditional to chess articles. Two leading spaces on a text line indicate the start of a paragraph. Line breaks are reproduced as they appear in the original. There are no blank lines.

The goal text is the chess game reproduced using PGN (Portable Game Notation), a non-proprietary standard for representation of chess game data using ASCII characters. PGN games can be read and written directly by a variety of chess software.

******* BENCHMARK
}
-------------------------------------------------------
SICILIAN DEFENCE [B88]
Sozin Variation
W: Aarne Hermlin
B: GM Leonid Shamkovich
Vilyandi, 1972
-------------------------------------------------------
1. e4 c5 2. Nf3 Nc6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 d6 6. Bc4 e6 7. Bb3 Be7 8. Be3 a6 9. f4 Nxd4 10. Bxd4 O-O 11. Qf3 b5!? (diagram)
  "What is this move?" asked my opponent, the famous Estonian theoretician, after the game. "I have never seen it." Well, it was just a successful improvisation -- a theoretical novelty.
12. e5
  After the game I found another attractive try, 12. Bxf6!?, with the forced variation 12. ... Bxf6 13. e5 Bh4+ 14. g3 Rb8 15. gxh4 Bb7; and as I pointed out in later comments to this game, "After 16. Qf1 (or 16. Ne4 dxe5 with advantage) 16. ... Bxh1 17. Qxh1 Qxh4+ apparently favors Black."* This line appeared, with some transposition, in the 12th game of the Kasparov-Short match. The Briton maintained the balance after 16. Ne4 dxe5 17. Rg1! g6 18. Rd1 Bxe4 19. Qxe4 Qxh4+ 20. Ke2 Qxf4, but 20. ... exf4, keeping queens on the board, seems to me to be more promising for Black. Kasparov, as far as I know, has the same opinion. There is an interesting third alternative in the Exchange sacrifice: 16. Qg3!? Bxh1 17. O-O-O Bb7 18. Rxd6 Qe7 with double-edged play. More solid is 12.
a3 Bb7 13. O-O-O, but 13. ... Rc8 (not 13. ... a5? 14. Bxf6 Bxf6 15. Nxb5 a4 16. Rxd6, which was quite good for White in Istratescu-Buturin (Bucharest 1992 -- see Informant 54 for the full game)) gives Black equal chances. The text, 12. e5, led to very sharp and instructive play.
12. ... dxe5 13. fxe5?
  13. Qxa8 is good for Black, but 13. Bxe5 is not so clear. Here is an approximate analysis of this important line: 13. ... Qb6! 14. Qxa8 Bb7 15. Bd4! (the rescue action) 15. ... Qc6 (not 15. ... Qc7 because of 16. Qa7 Nd7 17. Bd5!, winning) 16. Qa7 Nd7 (White is a rook ahead, but 17. ... Ra8 is threatened) 17. Bf2 (17. Be3? Bh4+! is very strong) 17. ... Nc5! 18. O-O-O Ra8, and now 19. Nd5 (or 19. Bd5 Qc7! wins) 19. ... exd5 20. Bxd5 Qf6, and White's queen is lost ingloriously.
13. ... Qxd4 14. exf6
  If 14. Qxa8 Qxe5+, and Black has excellent compensation for the Exchange.
14. ... Bc5! 15. fxg7
  The rook is still immune, as 15. Qxa8? Qf2+ 16. Kd1 Be3! wins right away.
15. ... Rd8! 16. Rd1 Qe5+ 17. Qe4 Rxd1+ 18. Kxd1 Qxe4 19. Nxe4 Be7
  Black has a clear plus, based on the strong bishop pair.
20. Nd2?
  The wrong maneuver. Correct is 20. Re1 Bb7 21. c3, which only slightly favors Black.
20. ... Bb7 21. Nf3 a5! 22. a4?
  Preventing the threat 22. ... a4, but allowing a blockade of the queenside. However, 22. a3 a4 23. Ba2 b4! as well as 22. c3 a4 23. Bc2 a3! also favors Black.
22. ... Rd8+ 23. Kc1 b4 24. Rd1
  24. Bc4 doesn't help because of 24. ... Bd6!
24. ... Rxd1+ 25. Kxd1 Bf6 26. Kc1 (diagram) 26. ... Bc6!
  Paralyzing White's bishop. The game is won.
27. Kd2
  A desperate attempt to win the a5-pawn with a king march. Another try is 27. Ne1, but it can be met by 27. ... Kxg7 28. g3 Bd4 29. Nd3 e5! with a won game for Black. The passed e-pawn will play a decisive role in the finish.
27. ... Bxb2 28. Kd3 Bxg7 29. Kc4 Bc3! 30. Kc5 Be4 31. Nd4
  After 31. Kb5 e5, Black's a-pawn is "poisoned" because of 32. Kxa5 Bxc2! 33. Bxc2 b3+, winning.
31. ... Bxg2 32. Nc6 Bxc6 33.
Kxc6
  Now the opposite-colored bishops cannot rescue White, as the passed e- and f-pawns are too strong.
33. ... e5 34. Kb5 e4 35. Kxa5 e3, White resigns.
  In view of the winning idea, 36. Bc4 b3+
-------------------------------------------------------
* From GM Shamkovich's The Chess Terrorist's Handbook, soon to be published.

******* GOAL TEXT

[Event "?"]
[Site "Vilyandi"]
[Date "1972.??.??"]
[Round "?"]
[White "Hermlin, Aarne"]
[Black "Shamkovich, Leonid"]
[Result "0-1"]

1. e4 c5 2. Nf3 Nc6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 d6 6. Bc4 e6 7. Bb3 Be7
8. Be3 a6 9. f4 Nxd4 10. Bxd4 O-O 11. Qf3 b5 12. e5 dxe5 13. fxe5 Qxd4
14. exf6 Bc5 15. fxg7 Rd8 16. Rd1 Qe5+ 17. Qe4 Rxd1+ 18. Kxd1 Qxe4
19. Nxe4 Be7 20. Nd2 Bb7 21. Nf3 a5 22. a4 Rd8+ 23. Kc1 b4 24. Rd1 Rxd1+
25. Kxd1 Bf6 26. Kc1 Bc6 27. Kd2 Bxb2 28. Kd3 Bxg7 29. Kc4 Bc3 30. Kc5 Be4
31. Nd4 Bxg2 32. Nc6 Bxc6 33. Kxc6 e5 34. Kb5 e4 35. Kxa5 e3 0-1

Shamko.txt: EOF
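
The comparison described in the preamble (checking a parser's output against the goal text) might be scripted roughly as follows. This is only an illustrative sketch in Python and is not part of the original benchmark. The names mainline_moves and matched_prefix are hypothetical. The sketch assumes movetext without comments or variations and tag values that contain no ']' character; it does not attempt to separate moves from annotative prose, which is the actual task of the benchmark.

```python
import re

# Game termination markers defined by PGN.
RESULT_TOKENS = ("1-0", "0-1", "1/2-1/2", "*")


def mainline_moves(pgn_text):
    """Return the SAN moves found in comment-free PGN movetext.

    Hypothetical helper: assumes no variations or brace comments, and
    tag values that do not contain a ']' character.
    """
    # Remove tag pairs such as [Site "Vilyandi"].
    without_tags = re.sub(r"\[[^\]]*\]", " ", pgn_text)
    moves = []
    for token in without_tags.split():
        if token in RESULT_TOKENS:
            continue
        # Drop move-number indications like "12." and a bare "...".
        token = re.sub(r"^\d*\.+", "", token)
        # Drop annotation suffixes and trailing punctuation ("b5!?" -> "b5").
        token = token.rstrip("!?.,;")
        if token:
            moves.append(token)
    return moves


def matched_prefix(candidate, goal):
    """Count how many leading moves of the candidate agree with the goal."""
    count = 0
    for ours, theirs in zip(candidate, goal):
        if ours != theirs:
            break
        count += 1
    return count


# Usage: a fragment of the goal text against the same moves as they
# appear, with annotation marks, in the article.
goal = mainline_moves('[Result "0-1"] 11. Qf3 b5 12. e5 dxe5 0-1')
candidate = mainline_moves("11. Qf3 b5!? 12. e5 12. ... dxe5")
print(matched_prefix(candidate, goal), "of", len(goal), "goal moves matched")
# prints: 4 of 4 goal moves matched
```

Counting the agreeing prefix, rather than the number of equal positions, reflects that a single missed or spurious move shifts every later move out of alignment; other scoring rules are equally possible.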