"Learning" Tic Taco Toe / Computer Science / Forums

Welcome

Online

Members Birthdays:

All today's birthdays', congrats!
Thomas (40)
VulcanBB18 (50)

Next birthdays
07/19 Thomas (40)
07/19 VulcanBB18 (50)
07/20 rev (49)

Contact

Support 4hv.org!

Donate:
4hv.org is hosted on a dedicated server. Unfortunately, this server costs and we rely on the help of site members to keep 4hv.org running. Please consider donating. We will place your name on the thanks list and you'll be helping to keep 4hv.org alive and free for everyone. Members whose names appear in red bold have donated recently. Green bold denotes those who have recently donated to keep the server carbon neutral.
Special Thanks To:

Aaron Holmes
Aaron Wheeler
Adam Horden
Alan Scrimgeour
Andre
Andrew Haynes
Anonymous000
asabase
Austin Weil
barney
Barry
Bert Hickman
Bill Kukowski
Blitzorn
Brandon Paradelas
Bruce Bowling
BubeeMike
Byong Park
Cesiumsponge
Chris F.
Chris Hooper
Corey Worthington
Derek Woodroffe
Dalus
Dan Strother
Daniel Davis
Daniel Uhrenholt
datasheetarchive
Dave Billington
Dave Marshall
David F.
Dennis Rogers
drelectrix
Dr. John Gudenas
Dr. Spark
E.TexasTesla
eastvoltresearch
Eirik Taylor
Erik Dyakov
Erlend^SE
Finn Hammer
Firebug24k
GalliumMan
Gary Peterson
George Slade
GhostNull
Gordon Mcknight
Graham Armitage
Grant
GreySoul
Henry H
IamSmooth
In memory of Leo Powning
Jacob Cash
James Howells
James Pawson
Jeff Greenfield
Jeff Thomas
Jesse Frost
Jim Mitchell
jlr134
Joe Mastroianni
John Forcina
John Oberg
John Willcutt
Jon Newcomb
klugesmith
Leslie Wright
Lutz Hoffman
Mads Barnkob
Martin King
Mats Karlsson
Matt Gibson
Matthew Guidry
mbd
Michael D'Angelo
Mikkel
mileswaldron
mister_rf
Neil Foster
Nick de Smith
Nick Soroka
nicklenorp
Nik
Norman Stanley
Patrick Coleman
Paul Brodie
Paul Jordan
Paul Montgomery
Ped
Peter Krogen
Peter Terren
PhilGood
Richard Feldman
Robert Bush
Royce Bailey
Scott Fusare
Scott Newman
smiffy
Stella
Steven Busic
Steve Conner
Steve Jones
Steve Ward
Sulaiman
Thomas Coyle
Thomas A. Wallace
Thomas W
Timo
Torch
Ulf Jonsson
vasil
Vaxian
vladi mazzilli
wastehl
Weston
William Kim
William N.
William Stehl
Wesley Venis

The aforementioned have contributed financially to the continuing triumph of 4hv.org. They are deserving of my most heartfelt thanks.

Forums

4hv.org :: Forums :: Computer Science

« Previous topic | Next topic »

"Learning" Tic Taco Toe

Move Thread

LAN_403

rp181

Sat Mar 08 2008, 01:41AM

Registered Member #1062 Joined: Tue Oct 16 2007, 02:01AM
Location:
Posts: 1529

I recently got interested in "learning" programs, so im going to try to make my own learning program in Java for tic tac toe. This is the method i have.

The normal tic tac toe program, but it records all the moves you make, and it makes.
At the end of the game if it (computer);
Looses:
Subtracts 1 "point" from the positions it moved in an array
Adds 1 "point" to whatever the player did
Wins:
Adds 1 point to what it moved
Subtract 1 point to what the player moved

The points will have a maximum, say 100, and start at the midpoint [50]. Minimum will be zero.
Im thinking the more games it plays, itle "realize" the best place to move, by increasing the probability of the program to choose the position on the board.

I know this wont work for different player very well, with different playing styles and all, and the tic tac toe tricks some people use.

Are there any big flaws anyone sees?

Bjørn

Sat Mar 08 2008, 02:59AM

Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058

Is this on a 3x3 grid?

If I understand your explanation correctly you will give each position in the grid a value. So the computer will play the position with the highest value that is still free.

The problem is that the value of a position depends on the previous moves in the game and not on the average of all games. Since the computer will not use the current state of the grid to assign a value to each position it will not be able to block the opponent and the opponent will have a very easy game.

This strategy will work in Reversi to a small degree since different positions have a very different value (a corner can't be reversed so it is very valuable). In this case I am not sure it will do much good.

rp181

Sat Mar 08 2008, 02:05PM

Registered Member #1062 Joined: Tue Oct 16 2007, 02:01AM
Location:
Posts: 1529

yes, its a 3x3 grid. Im getting a book on artificial neural networks in java, so ile wait for that. I was thinking to have it think to check for wins that the player might have in the future(eg. 2 X in row). If it doesn't then it applies the stratagy.

Bjørn

Sat Mar 08 2008, 10:15PM

Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058

Let us think about it a little, there are 9 squares with 3 possible states each, so the total number of states of the board is 3^9 = 19 683.

There are at most 9 possible places to move, that fits in 4 bits. So a table of 9842 bytes can store every possible state of the board and the best move for each state. If we account for different symmetries then the table becomes a lot smaller.

Then there is a matter of finding out what is the optimal move for a given state. You could use your first method only now you have an array for each possible state and let it learn as it plays, or you can just play all possible games that can result from each possible state once and for all.

When the game starts there will be 9 free positions, one the next move there will be 8 and so on. So there should be 9*8*...*1 = 9! = 362 880 different games to play. So filling the table using brute force should take at most (362 880 * 19 683) = 7 142 567 040 games.

The real number will be considerably smaller since the games does not run from the start and many of them will end before each position on the board is used. Some of the 19 683 states also describe games that have already ended. It should not take long to run through all possible combinations.

Alex

Wed Mar 12 2008, 12:38AM

Geometrically Frustrated
Registered Member #6 Joined: Thu Feb 02 2006, 04:18AM
Location: Bowdoin, Maine
Posts: 373

I don't know a whole lot about computer algorithms, but the first thing that came to my mind was the Markov chain. Perhaps you could use a high order Markov chain where only 'winning' moves are stored. That might end up being a large amount of data, and you'd need to give it some way to decide what to do in situations it hasn't seen before, so it's probably not the best system. Something to think about, anyway.

rp181

Wed Mar 12 2008, 01:23AM

Registered Member #1062 Joined: Tue Oct 16 2007, 02:01AM
Location:
Posts: 1529

Ile look up the markov chain =)

Moderator(s): Chris Russell, Noelle, Alex, Tesladownunder, Dave Marshall, Dave Billington, Bjørn, Steve Conner, Wolfram, Kizmo, Mads Barnkob

Legal Information