Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GRASR strategy #1051

Open
drvinceknight opened this issue Jun 22, 2017 · 21 comments
Open

GRASR strategy #1051

drvinceknight opened this issue Jun 22, 2017 · 21 comments

Comments

@drvinceknight
Copy link
Member

drvinceknight commented Jun 22, 2017

EDIT: This issue is no longer about the strategy described below. It now corresponds to an implementation of a Fortran strategy.

This strategy is one of the strategies from Axelrod's original tournaments. It is described here http://axelrod.readthedocs.io/en/stable/reference/overview_of_strategies.html#graaskamp as:

1. Play Tit For Tat for the first 50 rounds;
2. Defects on round 51;
3. Plays 5 further rounds of Tit For Tat;
4. A check is then made to see if the opponent is playing randomly in which case it defects for the rest of the game;
5. The strategy also checks to see if the opponent is playing Tit For Tat or another strategy from a preliminary tournament called ‘Analogy’. If so it plays Tit For Tat. If not it cooperates and randomly defects every 5 to 15 moves.

Step 4 can be implemented using a chi squared test similarly to the Stein and Rapaport strategy which is implemented here: https://github.com/Axelrod-Python/Axelrod/blob/master/axelrod/strategies/axelrod_first.py#L496

Step 5 is a bit tricky as it refers to a strategy called "Analogy". I have not been able to find any information describing this strategy. I suggest that anyone who tries to work on this does their best to find that information but if there is non available let us simply ignore that part of the check but clearly point that out in the documentation of the strategy.

(Other strategies that we'd like to implement are available here: #379)

@drvinceknight drvinceknight changed the title Specific strategy request: Graaskamp Specific strategy: Graaskamp Jun 22, 2017
@SplitInfinity
Copy link

SplitInfinity commented Jun 23, 2017

I'm a newcomer and I'd like to work on this. I did some (admittedly superficial) research and couldn't find anything about the 'Analogy' strategy, so I'm going to follow your suggestion leave that unimplemented for now (with a note in the documentation of the strategy).

That being said, I have one question. This paper defines Graaskamp slightly differently:

Follows a tit-for-tat strategy for 50 rounds, defects on round 51, then plays 5
more rounds of tit for tat. A check is then made to see if the opponent is playing random
(in which case it switches to AllD), or playing tit for tat or analogy or is its own twin (in
which case it responds with tit for tat). Otherwise, it randomly defects every 5 to 15
rounds.

Should my implementation also play tit-for-tat (TFT) if it detects that the opponent is also using Graaskamp (this is my interpretation of "its own twin")? If so, what is the best way to detect Graaskamp? Is it sufficient to look for 50 rounds of TFT followed by defect followed by 5 more rounds of TFT?

@drvinceknight
Copy link
Member Author

I'm a newcomer and I'd like to work on this. I did some (admittedly superficial) research and couldn't find anything about the 'Analogy' strategy, so I'm going to follow your suggestion leave that unimplemented for now (with a note in the documentation of the strategy).

That's great @SplitInfinity!

I suggest going with both implementations. So one can be Graaskamp1980 and the other Graaskamp2001.

In terms of "playing it's own twin", you can see the histories of both so self.history == opponent.history in the strategy method would be a sufficient test for that.

I hope that helps :)

@drvinceknight
Copy link
Member Author

You might have already looked through the documentation but if not this is the specific section on contributing with guidance on writing a strategy: http://axelrod.readthedocs.io/en/stable/tutorials/contributing/index.html

👍

@marcharper
Copy link
Member

If you are willing to look through some Fortran, this strategy appears to be in the second Axelrod tournament as well. See the source code and look for a line "C TYPED BY AX, 3/27/79 (SAME AS ROUND ONE GRAASKAMP)" (there's another strategy "C BY JIM GRAASKAMP AND KEN KATZEN" that I think is different).

Here is some info on interpreting the Fortran functions.

@SplitInfinity
Copy link

@marcharper

How do you interpret the if statements in that Fortran code? For example, what is the meaning of:

IF (MOVEN - 51) 1, 2, 3

I assume from my knowledge of C that predicate is false only if MOVEN = 51 and true otherwise. But what does the 1,2,3 after the if clause mean?

@marcharper
Copy link
Member

If (...) < 0 go to 1, == 0 Goto 2, < 0 Goto 3. See 11.4 here.

@SplitInfinity
Copy link

SplitInfinity commented Jun 25, 2017

@marcharper

Thanks! I think I understand the implementation you linked on a statement/language level now but not yet on an algorithmic level. Should Graaskamp1980 be essentially a translation of the Fortran implementation you linked to, or should I implement what is currently described in the Axelrod documentation as the Graaskamp strategy?

@marcharper
Copy link
Member

Ideally these are the same -- hopefully the Fortran can help explain anything missing.

@SplitInfinity
Copy link

@marcharper

The one thing @drvinceknight and I couldn't figure out is how to detect if the opponent is using the 'Analogy' strategy. How does the Fortran code you linked do that?

@marcharper
Copy link
Member

I don't know -- I haven't taken the time to translate the Fortran code and we don't know what the analogy strategy is in any case. Probably one or more of the conditionals detects whatever analogy was doing...

@SplitInfinity
Copy link

@drvinceknight

If not it cooperates and randomly defects every 5 to 15 moves.

My interpretation of this is that the strategy class should take a constructor argument called random_defect_frequency that is between 5 and 15 inclusive, and random_choice() should be used to decide what to do in each round (after round 56) that is a multiple of random_defect_frequency. The defect probability should also be a constructor argument.

Another interpretation is that starting at round 57, the implementation should generate a random number r = randrange(5, 15) and defect (with 100% probability) in round 57 + r. r is regenerated in round 57 + r as well (or should it be in round 57+15 = 72?).

Which of these two makes the most sense?

@marcharper
Copy link
Member

The FORTRAN is below.

It looks like at some point (near the end) the strategy computes N = RANDO * 10.0 + 5.0 and counts down to N=0 at which time it defects; otherwise it plays TFT or cooperates depending on the opponent's moves. So it looks like the second interpretation is closer to the original.

Whether the opponent is playing analogy appears to be depending on some count of four things that meets some criteria. That may be difficult to mimic exactly, but you can try to decipher the code below for some clues.

    Integer FUNCTION GRASR(JPICK, MOVEN, ISCOR, JSCOR, RANDO,JA)
    DIMENSION NMOV(4)
    GRASR=JA        ! Added 7/32/93 to report own old value
c Next line for debugging
c   if(moven. eq. 57)  write(6,99) jscor
c99 format(' TEST from GRASR at move 57. jscor = ', i6)
      IF (MOVEN .NE. 1) GO TO 9997
    DO 9996 I = 1, 4
    NMOV(I) = 0
9996    CONTINUE
    NMOVE = 0
    IGAME = 0
    N = 0
9997    CONTINUE
    IF (MOVEN - 1) 25, 25, 26
25  GRASR = 0
    RETURN
26  IF (MOVEN - 51) 1, 2, 3
1   GRASR = JPICK
    RETURN
2   GRASR = 1
    RETURN
3   IF (MOVEN - 57) 4, 5, 6
4   IF (MOVEN - 52) 9, 9, 10
10  NMOV(MOVEN - 52) = MMOVE + JPICK
9   GRASR = JPICK
    IF (GRASR -1) 7, 8, 8
7   MMOVE = 2
    GO TO 11
8   MMOVE = 4
11  RETURN
5   IF (JSCOR - 135) 19, 19, 20
20  J = NMOV(2)
    GO TO (12, 12, 30, 31, 32), J
31  IF (NMOV(1) - 3) 12, 35, 12
35  IF (NMOV(3) - 3) 12, 16, 12
32  IF (NMOV(1) - 5) 12, 33, 12
33  IF (NMOV(3) - 5) 12, 16, 12
30  IF (NMOV(1) - 2) 12, 34, 12
34  IF (NMOV(3) - 4) 12, 40, 12
40  IF (NMOV(4) - 2) 12, 41, 12
12  IGAME = 1
    N = RANDO * 10.0 + 5.0
    GRASR = 0
    RETURN
16   IGAME = 2
    GRASR = 0
    RETURN
19   IGAME = 3
27   GRASR = 1
    RETURN
41   IGAME = 4
42   GRASR = 0
    IF (MOVEN - 118) 44, 43, 43
43  IGAME=2
44  RETURN
6   GO TO (21, 22, 27, 42), IGAME
21  IF (N) 23, 23, 24
23  GRASR = 1
    N = RANDO * 10.0 + 5.0
    RETURN
24  GRASR = JPICK
    N = N-1
    RETURN
22  GRASR = JPICK
    RETURN
    END

@SplitInfinity
Copy link

SplitInfinity commented Jun 29, 2017

@marcharper

I've studied the Fortran and unless I've made a huge mistake (which is possible), it is not the same as the Graaskamp strategy in the Axelrod documentation and in the very first post above. The Fortran code plays tit for tat for 50 rounds, defects in round 51, and plays tit for tat for 5 more rounds. But after that, it does things differently. In round 57, the Fortran code chooses one of four "modes" (stored by the variable IGAME) based on the opponent's score and the opponent's moves during rounds 52-56:

Mode 1
What: Tit for tat starting with cooperate, defect randomly every 5 - 15 turns
When: Anything other than the following cases

Mode 2
What: Tit for tat for rest of the game starting with cooperate
When: If at round 57, the opponent's score is greater than 135 and its last 5 moves were 52: C, 53: D, 54: C, 55: D or 52: D, 53: D, 54: D, 55: D (the move in round 56 doesn't seem to matter)

Mode 3
What: Defect for the rest of the game
When: If at round 57, the opponent's score is less than or equal to 135

Mode 4
What: Cooperate until move 118 and then transition to mode 2
When: If at round 57, the opponent's score is greater than 135 and its last 5 moves were 52: C, 53: C, 54: D, 55: C, 56: C


  1. Mode 2 above most likely corresponds to the opponent using the "Analogy" strategy since the consequence is to play tit for tat, but I'm not 100% sure
  2. @drvinceknight suggested using a chi squared test to detect if the opponent is playing randomly. But the Fortran code doesn't do anything like this.
  3. Step 5 of the Axelrod documentation for Graaskamp states that it cooperates and then randomly defects every 5-15 turns. But the Fortran code plays tit for tat and randomly defects every 5-15 turns.
  4. If the adversary was playing tit for tat, its move sequence would be 52: D, 53: C, 54: D, 55: C, but the Fortran code doesn't check for this.
  5. The Axelrod documentation for Graaskamp doesn't mention anything that resembles mode 4 above.

In light of these differences, what would you suggest I implement?

@marcharper
Copy link
Member

My guess is that one of the following is happening:

  • they are the same and that the description above comes from a summary (one of the books about Axelrod's tournaments perhaps) that doesn't fully explain the strategy
  • it's possible that the comment in the Fortran code is incorrect and this isn't the same as round one Graaskamp
  • it's possible that I misunderstood the comments in the Fortran code -- there are two versions of Graaskamp (and this could be the wrong one)

My guess is that this is the real strategy. The other Fortran strategy associated to Graaskamp looks totally different (see below).

If you're up for it (and it seems like you are!), please implement the strategy as you've deciphered it from the Fortran. @drvinceknight probably recommended using a Chi-squared test since that is what another strategy does and it's a reasonable way to do it. However Mode 3 may also be a way of detecting "random" play (or how someone interpreted that when writing the description).

Thanks for looking so closely at the details! There's a bit of archeology involved when implementing these strategies (often the hardest part).

C BY JIM GRAASKAMP AND KEN KATZEN
C FROM CARDS BY JM 2/22/79
    k60r=ja    ! Added 7/27/93 to report own old value
      IF (M-1)1,1,2
1     ID=0
      K60R=0
      GO TO 50
2     IF (ID-1)3,4,4
3     K60R=J
      IF (M-11)50,5,6
5     IF (K-23)51,50,50
6     IF (M-21)50,7,8
7     IF(K-53)51,50,50
8     IF (M-31)50,9,10
9     IF (K-83)51,50,50
10    IF (M-41)50,11,12
11    IF (K-113)51,50,50
12    IF (M-51)50,13,14
13    IF (K-143)51,50,50
14    IF (M-101)50,15,50
15    IF (K-293)51,50,50
51    ID=1
4     K60R=1
50    RETURN
      END

@SplitInfinity
Copy link

I'm almost ready to post a pull request with an implementation of Graaskamp as deciphered from the Fortran. I have a few more loose ends to tie:

  1. One of the four "modes" I described earlier is stochastic. I know that you can specify an RNG seed when writing a test case, but how can you know what sequence of random numbers you'll get (and therefore the sequence of moves the strategy will make)? Do you just have to try it and see?

  2. I'd like to add the Fortran to the project bibliography. What is the citation format for the bibliography and how should I cite the Fortran code?

@drvinceknight
Copy link
Member Author

Hi @SplitInfinity :)

One of the four "modes" I described earlier is stochastic. I know that you can specify an RNG seed when writing a test case, but how can you know what sequence of random numbers you'll get (and therefore the sequence of moves the strategy will make)? Do you just have to try it and see?

Yup, no smart way that I'm aware of, simply search for a seed which ultimately pins the behaviour you want.

I'd like to add the Fortran to the project bibliography. What is the citation format for the bibliography and how should I cite the Fortran code?

I'd suggest going with:

[Axelrod93] Axelrod, R. (1993) Center for the Study of Complex Systems Fortran source code Version 1.0. http://www-personal.umich.edu/~axe/CC/CC2/TourExec1.1.f.html

I've gotten those details from the header on the source code.

@drvinceknight
Copy link
Member Author

Hi @SplitInfinity just checking how this is going? How do you feel your strategy compares to the description at http://axelrod.readthedocs.io/en/stable/reference/overview_of_strategies.html#graaskamp

That description is from Axelrod's first tournament and the Fortran code is from his second. We've come to realise that it's not necessarily true that the author (Graaskamp) would have submitted the same to both.

@drvinceknight
Copy link
Member Author

It might be that we mark your strategy as MoreGraaskamp as it's based on the Fortran code from the second tournament. Axelrod's first tournament was written up in a paper called "Effective choice in the
Prisoner's dilemma" and his second in a paper called "More Effective choice in the Prisoners Dilemma".

Note we've created a separate/partner library that directly speaks to the Fortran code of the second tournament: https://github.com/Axelrod-Python/axelrod-fortran. We've opened #1095 to keep up with translating those Fortran strategies in to Python which I believe is what you have accomplished.

@SplitInfinity
Copy link

SplitInfinity commented Aug 4, 2017

Hi @drvinceknight

I've been on vacation for the last month - sorry for not communicating this to the team! The strategy as I deciphered it from the Fortran provided is completely different from the description in the Axelrod documentation (hence the confusion in my earlier posts on this issue).

Status

This gist shows the strategy and its tests so far: https://gist.github.com/SplitInfinity/ca42d7aa0defe143ee9a08187735fa00

The strategy itself has been implemented, and basic tests against a cooperator, defector and alternator have been written. These tests verify the behaviour of the strategy up to round 57, after which the strategy enters one of four "modes" as described earlier. I'm currently writing tests that test each of these four modes. One of the modes is stochastic, and I'm trying to figure out a smart way to test it.

Next Steps
Should I still send a PR to this repository, or should I send a PR to Axelrod-Python/axelrod-fortran?

@marcharper
Copy link
Member

marcharper commented Aug 4, 2017

Welcome back! Open the PR on this repository, please. For the stochastic tests please set random seeds for reproducibility. If you have trouble finding suitable seeds try looping the test over range(0, 1000) or similar. If you need help please let us know.

@drvinceknight
Copy link
Member Author

I've been on vacation for the last month - sorry for not communicating this to the team!

Absolutely no problem :) Hope you had a great vacation.

The strategy as I deciphered it from the Fortran provided is completely different from the description in the Axelrod

Ok that's great to have all confirmed. 👍

One of the modes is stochastic, and I'm trying to figure out a smart way to test it.

Not sure if you've seen but there's the option to pass a random seed to versus_test: http://axelrod.readthedocs.io/en/stable/tutorials/contributing/strategy/writing_test_for_the_new_strategy.html

So that lets you check behaviour for a fixed random path.

Should I still send a PR to this repository, or should I send a PR to Axelrod-Python/axelrod-fortran?

Please send it to this repo, the axelrod-fortran library is a wrapper of the Fortran ones.

I'm going to rename this issue to reflect that you have reverse engineered and are implementing a Fortran strategy from axelrod_fortran which corresponds to #1095.

I suggest you call the strategy MoreGraaskamp. (Axelrod's first paper on his first tournament was called "Effective Choice in the Prisoners Dilemma" and his second: "More Effective Choice ...").

I'll also open another issue to get the original Graaskamp strategy implemented :)

Thanks for your work on this. 👍

@drvinceknight drvinceknight changed the title Specific strategy: Graaskamp GRASR strategy Aug 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants