N Queens Problem

2022-03-17 CC-BY

Short introduction

More information on the eight queens problem is available on Wikipedia.

Is it possible to place eight chess queens on a chessboard so that no two queens attack?

Queens attack pieces on both diagonals and in the same column and row. Below is one possible placement for a chessboard with 15 × 15 squares. No one is under attack. I wonder what kind of representation does this placement have in the head of a chess grandmaster?

000000010000000 100000000000000 000100000000000 010000000000000 000000000100000 000000000000010 000000001000000 001000000000000 000010000000000 000000000000100 000000000000001 000000000001000 000001000000000 000000000010000 000000100000000

👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
One placement of queens on a 15 × 15 chessboard

Recursive enumeration of placing N queens in Python

I am not interested in a program that will try all possible placements and count those that satisfy the condition.

Given a partial placement, is there a fast way to figure out which rows, columns and diagonals are being attacked?

Turns out there is a fast check. Queen at position (i,j) sits on four lines. Vertical, horizontal, y = x + c and y =  − x + c. Replacing x and y by i and j we get four numbers that can be used to index an array:

Program 0

N = int(input())
Y, D1, D2 = [0]*N, [0]*N*2, [0]*N*2
CNT = 0
def place(i: int = 0):
    global CNT
    if i >= N:
        CNT += 1
        return
    for j in range(N):
        if Y[j] + D1[i+j] + D2[i-j]:
            continue
        Y[j]=D1[i+j]=D2[i-j]=1
        place(i+1)
        Y[j]=D1[i+j]=D2[i-j]=0

place()
print(CNT)

Program above prints the correct count that is listed on OEIS. Calculating the number of all possible queen placements for numbers bigger than 15 is too slow for me. It is nice that the program is always looking at one placement. After the attempt to place the rest of the queens – place(i+1), the arrays can be cleared and next j column can be tried for placement.

Note that Python lists work with negative indices (i-j is negative in many cases).

Printing all of the placements is also possible:

Program 0 with prints

N = int(input())
Y, D1, D2 = [0]*N, [0]*N*2, [0]*N*2
CNT = 0
S = [['0']*N for _ in range(N)]
def place(i: int = 0):
    if i >= N: 
        global CNT
        CNT += 1
        print('\n'.join(''.join(s) for s in S),
              end='\n\n')
        return
    for j in range(N):
        if Y[j] or D1[i+j] or D2[i-j]: continue
        Y[j]=D1[i+j]=D2[i-j]=1
        S[i][j]='1'
        place(i+1)
        S[i][j]='0'
        Y[j]=D1[i+j]=D2[i-j]=0

place()
print(CNT)
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
All possible placements for 6 × 6 chessboard. There is only 4 of them. If we stare a bit more, then it is apparent that it is just 2 placements rotated 1 time by 90 degrees (if colors of the squares are ignored). If mirroring is allowed (flipping over vertical or horizontal dividing line), then there is only 1 placement.

Recursive enumeration of placing N queens in C++

Python is a bit slow. Doing the same implementation in C++ should allow me to see some bigger numbers. In the grand scheme of things, the constant factor is negligible.

Program 1

#include <bits/stdc++.h>
using namespace std;
int N;
array<int, 40> Y, D1, D2;
int CNT= 0;
main() {
  cin >> N;

  function<void(int)> place= [&](int i) {
    if(i >= N) ++CNT;
    else
      for(int j= 0; j < N; ++j) {
	if(Y[j] + D1[i + j] + D2[i - j + N]) continue;
	Y[j]= D1[i + j]= D2[i - j + N]= 1;
	place(i + 1);
	Y[j]= D1[i + j]= D2[i - j + N]= 0;
      }
  };

  place(0);
  cout << CNT << endl;
}

Program above for N = 15 takes ~ 5% of time of the Python program (3.1s / 64.2s). It is a line by line translation of the Python program. Unfortunately, negative indices are not supported so “+ N” needs to be carefully added.

Recursive enumeration with bits in C++

The current world record for calculating the number of all possible placements was done for N = 27. [PrEn17] It should be possible to use something other than an array of O(N) elements because I cannot hope to beat that record with my personal machine. In this case the zeros and ones fit inside a 64-bit integer (there are ~ N × 2 diagonals).

Instead of tracking the same Y, D1 and D2 in bits, it is possible to apply bit tricks to make things even more efficient:

I recommend checking [Rich97] that is particular for this problem, [Warr13] that goes through various bit tricks in detail or very nice and detailed list that is available at this link. Enumeration of subsets of size k is one I remember using at some point.

Program 2

#include <bits/stdc++.h>
using namespace std;
using ull= unsigned long long;
ull N;
ull Y, D1, D2;
ull CNT= 0;
main() {
  cin >> N;
  ull ALL= ~(~0 << N);
  Y= D1= D2= ~0;
  Y&= ALL;
  function<void()> place= [&]() {
    if(Y == 0) ++CNT;
    else
      for(ull cols= Y & D1 & D2, c= cols & -cols; cols;
	  cols^= c, c= cols & -cols) {
	ull y= Y, d1= D1, d2= D2;
	Y^= c, D1^= c, D2^= c;
	D1= rotl(D1, 1), D2= rotr(D2, 1);
	place();
	Y= y, D1= d1, D2= d2;
      }
  };

  place();
  cout << CNT << endl;
}

Program above for N = 16 takes ~ 30% of the execution time of the first C++ program (6.2s / 21.6s). This is a nice improvement. Interesting note, removing int i from the arguments of function place shaves off 1 second.

Symmetry breaking: Do not enumerate placements that mirror another

It is evident that placing a queen in the first half of the first row results in placements that mirror those where queen is placed in the second half. For odd N, previous observation holds except when queen is placed in the middle square as illustrated in first figure). When queen is in the middle of the first row, then placing a queen in the first half of the second row results in placements that mirror those where queen is placed in the second half (of the second row).

Previous program can simply be edited to include the restrictions and decrease execution time. The CNT is multiplied by 2 at the end:

Program 3


#include <bits/stdc++.h>
using namespace std;
using ull= unsigned long long;
ull N;
ull Y, D1, D2;
ull CNT= 0;
ull ALL, MID;

template <bool R> ull init(ull x) { return R ? x & (ALL << (N / 2)) : x; }
template <bool R1= false, bool R2= false> void place() {
  for(ull c, cols= init<R1>(Y & D1 & D2); cols; cols^= c) {
    c= cols & -cols;
    ull y= Y, d1= D1, d2= D2;
    Y^= c, D1^= c, D2^= c;
    D1= rotl(D1, 1), D2= rotr(D2, 1);
    CNT+= Y == 0;
    R2 &&c == MID ? place<true>() : place();
    Y= y, D1= d1, D2= d2;
  }
}

main() {
  cin >> N;
  ALL= ~(~0 << N);
  Y= D1= D2= ~0;
  Y&= ALL;
  MID= 1 << (N / 2);
  N % 2 ? place<true, true>() : place<true>();
  cout << CNT + CNT << endl;
}

I have extracted the function and added two template arguments. First argument R1 indicates if the first row will be restricted. Second one R2 indicates if check for the middle in the first row is made and then the second row restricted in the recursive call. For N = 17 this program takes ~ 45% of the execution time of the previous C++ program (19.8s / 43.4s). If template code is replaced by simple checks for odd N and MID, then the execution takes ~ 3% more time. I moved c= cols & -cols from for iteration expression to the body of the loop and this improved the runtime by ~ 3%.

Conclusion and further improvements

First time I enumerated all placements for N 👸👸🏻👸🏼👸🏽👸🏾👸🏿 was decade ago, in Scala. It was slow and complicated. I wanted to revisit the problem and try to write code as simply as possible. That is the reason for commas, global variables, monstrous mutability and non-ISO C++.

Additional symmetries like rotation give more ways to cut off the search space. This was the case with mirroring where search space was cut by a factor of two. Current search happens row by row, from first to last. This cannot support avoiding search trees that will discover the rotation of the same placement.

It looks like counting A033148 might be easier for large N. Might be related to solutions where queens sit on a line. Instead of enumerating, trying to find the minimum number of queens that attack all unoccupied squares looks like a fruitful venture too.

Execution time in milliseconds of each program for varying N.
N Program 0 Program 1 Program 2 Program 3
10 32 2 2 2
11 104 5 2 2
12 506 15 6 3
13 2893 80 27 13
14 17560 481 152 70
15 103095 3092 958 431
16 21318 6347 2868
17 154662 44037 19828
18 326972 149369

Given the current trend of execution time in table, I can expect the count for N = 27 to appear after 300 years. Interestingly the number of placements for N = 27 is 234907967154122528 and it fits into a 64-bit integer, so I do not even have to change the program and start using _m128i.

To make my page even less interesting for the Google Search Engine I am going to add a bunch of emojis and tables here with 20 placements for board size 7 × 7 – output of program 3 (yes, HTML output).

👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑
👑

Bibliography

[PrEn17]
Preußer, Thomas B ; Engelhardt, Matthias R: Putting queens in carry chains, № 27. In: Journal of Signal Processing Systems Bd. 88, Springer (2017), Nr. 2, S. 185–201
[Rich97]
[Warr13]
Warren, Henry S: Hacker’s delight : Pearson Education, 2013