lichess.org

List of Pawn Structures (for Deep Learning)

Hi,

I am trying to make the computer explain its evaluation.

For that, pawn structure is an important feature of a given position.
I found that Wikipedia shows a few setups, which I can use.

en.wikipedia.org/wiki/Pawn_structure

Is there any database of different pawn structures that result from the main openings?
Kmoch and Nimzowitsch certainly give them animal names. In fact "My System" is a bit like the animal farm of chess.
Is there any cross-sectional view of the many bushes of openings that would allow grouping based on position information? Deep learning uses that, since it is its strength: extracting similarity-based features, or using them if they are already in the encoding (provided not too many contortions are needed to learn useful ones from those human-chosen position features).

Have there been past attempts that could have been put into a "database" or, more humbly, sets of positions, or even crafted exemplars (as on the Wikipedia page, or probably in the books by human chess-theory authors that are its source; the name Soltis comes to mind, but that is name-dropping on my part)?

And main openings: you may be opening a can of worms. Main as in popular, or as in historically well travelled and systematically explored to deep confines? I would also like to know. So I second your question, but would suggest accepting some smaller-scope answers.
#1 From that Wikipedia link...

"For a formation to fall into a particular category, it need not have a pawn position identical to the corresponding diagram, but only close enough that the character of the game and the major themes are unchanged. It is typically the center pawns whose position influences the nature of the game the most."

That first sentence is a big issue in making a program able to say which category, if any, a position belongs to. The second sentence is probably why Kotov, in "Think Like a Grandmaster", just considered center pawns in his categories.
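
One way to make "close enough" concrete is to compare only the center pawns, so that positions differing on the flanks still land in the same bucket. The sketch below is only an illustration of that idea: the function name `center_pawn_key` is made up, and treating "center" as the d- and e-files is an assumption (widen `files` to "cdef" if that suits the categories better).

```python
def center_pawn_key(placement: str, files: str = "de") -> str:
    """Reduce a FEN piece-placement field to the pawns on the given files.

    Assumption: "center" means the d- and e-files unless `files` says otherwise.
    All other pieces and pawns are treated as empty squares.
    """
    keep = {"abcdefgh".index(f) for f in files}
    ranks = []
    for rank in placement.split("/"):
        # Expand the rank into 8 explicit squares ('.' = empty).
        squares = []
        for ch in rank:
            if ch.isdigit():
                squares.extend("." * int(ch))
            else:
                squares.append(ch)
        # Keep only pawns that stand on one of the selected files.
        masked = [
            ch if (ch in "Pp" and i in keep) else "."
            for i, ch in enumerate(squares)
        ]
        # Compress runs of empty squares back into FEN digits.
        out, empty = [], 0
        for ch in masked:
            if ch == ".":
                empty += 1
            else:
                if empty:
                    out.append(str(empty))
                    empty = 0
                out.append(ch)
        if empty:
            out.append(str(empty))
        ranks.append("".join(out))
    return "/".join(ranks)


# Two pawn skeletons with the same locked d/e chain but different flank pawns.
a = "8/pp3ppp/4p3/3pP3/3P4/8/PP3PPP/8"
b = "8/p4pp1/1p2p2p/3pP3/3P4/P7/1P3PPP/8"
print(center_pawn_key(a) == center_pawn_key(b))
```

This still demands an exact match in the center, so it only softens the problem rather than solving it.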

Don't forget about reverse openings.
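
To treat a reversed opening as the same structure, a pawn skeleton can be mirrored top to bottom with the colors swapped, and the two versions mapped to one key. This is a minimal sketch; `flip_colors` and `canonical` are hypothetical helper names, and picking the lexicographically smaller string as the shared key is an arbitrary choice.

```python
def flip_colors(placement: str) -> str:
    """Mirror a FEN piece-placement field top to bottom and swap piece colors."""
    # Reversing the rank order mirrors the board; swapcase() turns P into p
    # and p into P, and leaves the digits for empty squares untouched.
    return "/".join(rank.swapcase() for rank in reversed(placement.split("/")))


def canonical(placement: str) -> str:
    """Return one shared key for a structure and its color-reversed twin."""
    return min(placement, flip_colors(placement))


# Pawn skeletons after 1.e4 c5 (Sicilian) and 1.c4 e5 (the same setup reversed).
sicilian = "8/pp1ppppp/8/2p5/4P3/8/PPPP1PPP/8"
reversed_sicilian = "8/pppp1ppp/8/4p3/2P5/8/PP1PPPPP/8"
print(canonical(sicilian) == canonical(reversed_sicilian))
```

Note that this merges the two sides, so any information about who has the extra tempo has to come from elsewhere in the encoding.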
Those are all great points.

I found a solution; I hope it kinda works.

I iterate over all the games/positions I have in my database, then create a FEN representation that includes only the pawns.
I order the FENs by how often they occur and select the top 100 (maybe the top 200).

The reasoning is that if a pawn setup does not occur often, it is not important to know. That is kinda okay for deep learning. Of course it is not 100% true, but otherwise the scope would blow up.
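
The counting step described above could look roughly like the following. It is a sketch using only the standard library; the function names are invented for illustration, and it assumes the positions are already available as FEN strings (how they are read from the database is left out).

```python
from collections import Counter


def pawn_only_placement(fen: str) -> str:
    """Return the piece-placement field of a FEN with everything but pawns removed."""
    placement = fen.split()[0]  # drop side to move, castling rights, etc.
    ranks = []
    for rank in placement.split("/"):
        out, empty = [], 0
        for ch in rank:
            if ch in "Pp":
                if empty:
                    out.append(str(empty))
                    empty = 0
                out.append(ch)
            elif ch.isdigit():
                empty += int(ch)
            else:
                empty += 1  # any non-pawn piece counts as an empty square
        if empty:
            out.append(str(empty))
        ranks.append("".join(out))
    return "/".join(ranks)


def most_common_structures(fens, top_n=100):
    """Count pawn-only skeletons and return the top_n as (skeleton, count) pairs."""
    counts = Counter(pawn_only_placement(fen) for fen in fens)
    return counts.most_common(top_n)


fens = [
    "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1",
    "rnbqkb1r/pppppppp/5n2/8/4P3/8/PPPP1PPP/RNBQKBNR w KQkq - 1 2",
    "rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq d3 0 1",
]
print(most_common_structures(fens, top_n=2))
```

Because every position of a game is counted, long games with a static structure weigh more than short ones; counting each skeleton once per game would be an alternative.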
I would object to the idea that deep learning is only affected by frequency. It looks at "cross-correlation" frequencies: not just the most popular configurations, but those internal configuration relationships that are linked to the target vector, which is the training feedback from the teaching environment. I am not saying that your idea is wrong, but having lots of positions that carry little information about the cross-correlation with the target will not help. That is my digest of what a layered feedforward NN architecture does. This comes from experience with smaller networks that still had hidden layers, looking closely at what they were doing compared to frequency-based profiling statistics, in a bioinformatics context (though that does not really matter once you are at the level of the mathematical formulation).
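
The difference between "frequent" and "informative about the target" can be shown with a crude per-structure statistic. This sketch is an assumption-laden stand-in, not what a network computes: `structure_signal` is a made-up name, the target is taken to be the game result (1, 0.5 or 0), and the sample data is invented purely for illustration.

```python
from collections import defaultdict


def structure_signal(samples):
    """Map each structure key to (count, mean result minus overall mean result).

    samples: iterable of (structure_key, result) pairs, result in {0, 0.5, 1}.
    A large count with a deviation near zero means "frequent but uninformative".
    """
    samples = list(samples)
    baseline = sum(result for _, result in samples) / len(samples)
    totals = defaultdict(lambda: [0, 0.0])  # key -> [count, summed result]
    for key, result in samples:
        totals[key][0] += 1
        totals[key][1] += result
    return {key: (n, s / n - baseline) for key, (n, s) in totals.items()}


# Invented toy data: "A" is common but tells little, "B" is rarer but skewed.
toy = [("A", 1), ("A", 0), ("A", 1), ("A", 0), ("B", 1), ("B", 1)]
for key, (count, deviation) in structure_signal(toy).items():
    print(key, count, round(deviation, 3))
```

Ranking by the size of the deviation (with some minimum count) instead of by raw count would be one simple way to favor structures that relate to the target.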
