6 types of robots vs Zenith Daylong

GIB, Argine and Ben vs the Zenith players

#1 diana_eva

Group: Admin
Posts: 4,997
Joined: 2009-July-26
Gender:Female
Location:bucharest / romania

Posted 2023-December-12, 04:24

At the public's request, Lorand ran another simulation, this time with 6 types of robots, on the December 10th Zenith Daylong.

These are the robot scores over 1,920 boards (all of the deal pools in Zenith):

Zenith Dec 10th
Gib Advanced = 53.91%
Gib Basic = 52.58%
Argine Advanced = 50.96%
Thinking Ben = 50.32%
Argine Simplified = 48.38%
Instant Ben = 45.64%

Argine is set to play 2/1 (it played 2/1 in the previous simulations too).
The Ben model used is trained on GIB Advanced for the bidding, and on ACBL human hands for the play of the hand.
Instant Ben just plays whatever the neural network says.
Thinking Ben runs some simulations using info from the neural network before making a move.

#2 pilowsky

Group: Advanced Members
Posts: 3,763
Joined: 2019-October-04
Gender:Male
Location:Poland

Posted 2023-December-12, 06:33

Remarkable how little difference there is between the two GIBs in this more "real" situation.

Fortuna Fortis Felix

#3 mycroft

Secretary Bird

Group: Advanced Members
Posts: 7,423
Joined: 2003-July-12
Gender:Male
Location:Calgary, D18; Chapala, D16

Posted 2023-December-12, 10:08

Also interesting how it matches my (personally observed, so *so not* statistically relevant) memory that "no matter how bad 'the robots' bid and play, according to real club player humans, it's amazing how, when two robots fill in a half table, they end up 'third or fourth, at 54%'."

When I go to sea, don't fear for me, Fear For The Storm -- Birdie and the Swansong (tSCoSI)

#4 pescetom

Group: Advanced Members
Posts: 7,900
Joined: 2014-February-18
Gender:Male
Location:Italy

Posted 2023-December-12, 11:16

mycroft, on 2023-December-12, 10:08, said:

May well be so if most or all club players are playing systems similar to Gib 2/1.
My experience is that the robot pair does much worse (low 40s) when they don't understand the opponent's bidding.

#5 pilowsky

Group: Advanced Members
Posts: 3,763
Joined: 2019-October-04
Gender:Male
Location:Poland

Posted 2023-December-12, 21:07

Just to be clear, in those simulations, was each engine playing South with GIB advanced at N,E and W to simulate the same conditions faced by human Souths in that tourney?

Fortuna Fortis Felix

#6 diana_eva

Group: Admin
Posts: 4,997
Joined: 2009-July-26
Gender:Female
Location:bucharest / romania

Posted 2023-December-12, 23:10

pilowsky, on 2023-December-12, 21:07, said:

Just to be clear, in those simulations, was each engine playing South with GIB advanced at N,E and W to simulate the same conditions faced by human Souths in that tourney?

Yes.

#7 lorserker

Group: Full Members
Posts: 97
Joined: 2007-November-26

Posted 2023-December-13, 02:25

pilowsky, on 2023-December-12, 21:07, said:

Just to be clear, in those simulations, was each engine playing South with GIB advanced at N,E and W to simulate the same conditions faced by human Souths in that tourney?

We are doing another run with the same bot playing both North and South. That setting is better for comparing bots to each other: how does a pair of GIBs compare to a pair of Argines, to a pair of Bens, etc. Making a bot play with a different bot as partner puts it at a disadvantage (this is especially true for Argine).

#8 pilowsky

Group: Advanced Members
Posts: 3,763
Joined: 2019-October-04
Gender:Male
Location:Poland

Posted 2023-December-13, 05:59

diana_eva, on 2023-December-12, 23:10, said:

Yes.

Thanks.

Fortuna Fortis Felix

#9 lorserker

Group: Full Members
Posts: 97
Joined: 2007-November-26

Posted 2023-December-14, 03:21

Hi, an update with a slightly different setting.
Both North and South are played by the same robot, so each robot plays with itself as partner.

Argine Advanced = 53.72%
Gib Advanced = 53.17%
Thinking Ben = 50.77%
Argine Basic = 50.62%
Gib Basic = 49.83%
Instant Ben = 45.47%

This new setup has helped Argine and has hurt Gib Basic.
My interpretation is that Argine has fewer misunderstandings now as she plays with herself as partner (Argine does play 2/1, but a slightly different flavor of 2/1 than gib).
Gib Basic's performance dropped because it now has a weaker partner (itself, instead of Gib Advanced).
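
To put numbers on that comparison, here is a minimal sketch that subtracts the first run's scores from this run's scores. The figures are copied from the two lists in this thread. One assumption to flag: the first list says "Argine Simplified" and this one says "Argine Basic", and the sketch treats them as the same bot; if they are not, that row is meaningless. The names `mixed`, `same_pair`, `ALIASES` and `deltas` are just illustrative.

```python
# Scores (%) from the first run: each bot sits South, GIB Advanced at N, E and W.
mixed = {
    "Gib Advanced": 53.91,
    "Gib Basic": 52.58,
    "Argine Advanced": 50.96,
    "Thinking Ben": 50.32,
    "Argine Simplified": 48.38,
    "Instant Ben": 45.64,
}

# Scores (%) from the second run: the same bot plays both North and South.
same_pair = {
    "Argine Advanced": 53.72,
    "Gib Advanced": 53.17,
    "Thinking Ben": 50.77,
    "Argine Basic": 50.62,
    "Gib Basic": 49.83,
    "Instant Ben": 45.47,
}

# ASSUMPTION: "Argine Basic" here is the bot listed as "Argine Simplified" earlier.
ALIASES = {"Argine Basic": "Argine Simplified"}


def deltas(before, after, aliases):
    """Return {bot: after - before} in percentage points, rounded to 2 places."""
    out = {}
    for name, score in after.items():
        key = aliases.get(name, name)
        out[name] = round(score - before[key], 2)
    return out


changes = deltas(mixed, same_pair, ALIASES)

# Print the biggest gain first.
for name, d in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:16s} {d:+.2f}")
```

On these figures the two Argines gain the most and Gib Basic loses the most, while Gib Advanced and both Bens move by less than one percentage point.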

#10 Ditherer

Group: Members
Posts: 6
Joined: 2018-December-25

Posted 2023-December-23, 04:19

diana_eva, on 2023-December-12, 04:24, said:

Here is another interesting thing you might do with the data from this experiment: for each human h, calculate Advanced GIB's percentage P_h on the set of 16 boards played by h. What percentage of humans were beaten by the robot when playing the same set of boards? How did the robot's performance vary over sets, e.g., what were the minimum and maximum values of P_h?
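
A rough sketch of that calculation, in case it helps. It assumes the data is available as per-board percentages for the robot plus, for each human, the list of boards they played and their final score; that layout, the function names, and the toy numbers (2 boards per human instead of 16) are all made up for illustration. It also assumes a score on a set of boards is the plain average of the per-board percentages.

```python
from statistics import mean


def robot_pct_on_set(robot_board_pct, boards):
    """P_h: the robot's average percentage over the boards one human played."""
    return mean(robot_board_pct[b] for b in boards)


def compare(robot_board_pct, humans):
    """humans maps a name to (boards played, that human's final percentage)."""
    p = {
        h: robot_pct_on_set(robot_board_pct, boards)
        for h, (boards, _) in humans.items()
    }
    # A human counts as beaten when the robot scores strictly higher on the same set.
    beaten = sum(1 for h, (_, human_pct) in humans.items() if p[h] > human_pct)
    return {
        "P_h": p,
        "pct_beaten": 100.0 * beaten / len(humans),
        "min_P_h": min(p.values()),
        "max_P_h": max(p.values()),
    }


# Toy data only, not real tournament results.
robot = {1: 60.0, 2: 40.0, 3: 70.0, 4: 50.0}
humans = {
    "h1": ([1, 2], 55.0),
    "h2": ([3, 4], 45.0),
    "h3": ([1, 4], 52.0),
}

summary = compare(robot, humans)
print(summary)
```

With the real data, `pct_beaten` would answer the first question and `min_P_h` / `max_P_h` the second.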

#12 thepossum

Group: Advanced Members
Posts: 2,567
Joined: 2018-July-04
Gender:Male
Location:Australia

Posted 2023-December-24, 21:23

Is it possible to separate the variance in bidding and play?
Not sure how you would do it of course, and maybe very time consuming
Sorry for thinking idly - maybe even how often the contracts were the defining factor

Also pondering annoyingly. I wonder how the likes of Qplus and other 2/1 engines would stack up
- and I may be a contrarian but I like the idea of random pairings of different 2/1 bots - isn't that the point of bidding systems

Maybe a simple interpretative AI interface on each - what does GiB mean lol

BBO Discussion Forums: 6 types of robots vs Zenith Daylong - BBO Discussion Forums