Making a crossword puzzle about ML using Monte Carlo and ChatGPT

This is what a crossword puzzle should look like

This is what a crossword puzzle should look like

And this is how GPT-4 Omni copes with the task

Prompt> Please make a short crossword puzzle about machine learning.

GPT-4O> Sure! Here's an example of a 10-word machine learning crossword puzzle.

Crossword grid

 | _ _ _ _ _ _ _
 | _ _ _
 | _ _ _ _ _ _ _ _ _ _ _ _ _
 | _ _ _     _ _ _ _ _     _ _ _ _ _
 | _ _ _ _ _ _ _ _ _ _
 | _ _ _ _ _ _
 | _ _ _ _ _ _ _ _ _
 | _ _ _ _ _    _ _ _ _ _ _ _ _ _
 | _ _ _ _ _ _ _ _
 | _ _ _ _ _ _ _ _ _ _ _ _ _
Questions
  1. An approach in machine learning where an algorithm is trained on labeled data.

  2. A method for reducing the dimensionality of data.

  3. A parameter that controls the complexity of the model and can prevent overfitting.

  4. A graph used to evaluate the performance of classification models.

  5. An algorithm that works on the principle of decision tree voting.

  6. One of the most popular clustering algorithms.

  7. A Python library for machine learning.

  8. A technique used to evaluate the accuracy of a model by partitioning the data into parts.

  9. A general name for methods that use multiple base models to improve accuracy.

  10. The process of adjusting model parameters to achieve better results.

Answers

S U P E R V I S E D
P C A
R E G U L A R I Z A T I O N
R O C   C U R V E
R A N D O M F O R E S T
K M E A N S
S C I K I T L E A R N
C R O S S   V A L I D A T I O N
E N S E M B L E
H Y P E R P A R A M E T E R

Doesn't look much like a crossword puzzle, does it?

Things are no better in English

Prompt> Please compose small crossword-puzzle about ML.

GPT-4O> Certainly! Below is a small crossword puzzle focused on Machine Learning (ML) terms.

10  | O | V | E | R | F | I | T |
 I  |   |   |   | D |   |   | S |
 N  |   | P | I |   |   |   |   |
 S  |   | R | - | H | V | A | R |
 E  | A | G | G | R | E | P | I |
 T  |   | E | I | L | I | O | U |
   4| E | L | M | A | I | N | C | V 
 |
Across

1. Essential component in programming used to store data. (8)
4. A type of neural network used mainly for NLP. (3)
6. The first stage in data processing. (4)
8. A loss function often used in regression tasks. (6, 3)
9. Technique to reduce overfitting. (8)
10. Model evaluation technique dividing data into training and testing sets multiple times. (10)

Down

2. Autonomous programs that replicate human decision-making. (5)
3. Technique used in SVM to create linearly separable classes. (9)
5. Algorithm paradigm inspired by the structure of the brain. (6, 8)
7. A statistical measure used to understand distances in clustering. (6)

Since GPT can't handle the task, we'll have to figure out how to help it. In this article, I'll tell you how to do it.
All code can be found in repositories.

TL;DR Crossword about ML

Here is a link to the crossword.
It will take you 20-30 minutes to guess. It is definitely solvable, tested on live programmers.
N.B. The crossword is in English, because all the professional jargon is nothing but Anglicisms and I decided it would be strange if the questions were in Russian and the answers in English.

Step #1 Collecting a word base

At first I wanted to write some ML terms off the top of my head, but my imagination was only enough for 30 words. I had to arm myself with a data-driven approach.

  1. I found a reference book on ML terms on GitHub – https://github.com/bfortuner/ml-glossary

  2. Written by script which parses unique words – there were about 3000 of them.

  3. I looked through them manually and left only those that are suitable for use in the crossword: nouns or verbs in normal form, if a word appears in several spellings, then I leave the “most normal” one.

  4. In the end, I got a list of 242 words.

It took me about half an hour of real time to review 3 thousand words.
In theory, the database could be filtered automatically using the distance between word embeddings and something like TF-IDF.

Step #2: Put them together in a crossword puzzle

There are many websites on the Internet that allow you to create your own crossword puzzle from a list of words, for example Crossword Labsbut the final crossword turns out to be very “sparse”, I would like to make it more compact.

Sprawling crossword

Sprawling crossword

I wrote a small program in C++ which randomly tries to assemble the most compact crossword puzzle, works like this:

  1. Place the word in a random place on the field horizontally

  2. We try to place the next word vertically in a random place so that there is at least one intersection with other words.

  3. Next we place the word horizontally again with at least one intersection. GOTO p2

  4. If the word cannot be placed, we look at the “occupancy” metric (the number of occupied cells divided by the number of free cells).

  5. We repeat steps 1 through 4 several times, choosing the crossword with the maximum “business”.

Step #3: Make Questions with ChatGPT

Praise LLM, even earlier this stage was the most difficult to automate, now everything is done quite trivially. We run through every word via API from OpenAI (in total, for 242 words it cost me less than 10 cents).

System prompt

You are avid ML engineer working on a little puzzle (crossword) to check the knowledge of ML of your colleagues.

Prompt

Please compose a short ai/ml themed clue for word '{word}', do not mention number of letters in the clue.

Step #4 Putting it all together into a web page

I don't know anything about web development, so I tried to reverse engineer a few sites to “borrow” their crossword rendering engine, unfortunately I didn't succeed. But after about 3 hours of searching the Internet, I managed to find a great repository on GitHub that did what I needed https://github.com/jweisbeck/Crossword

Unfortunately, there were a lot of bugs, but somehow I managed to fix them. Another Python script to put it all together and…

Voila – the crossword is ready https://udalovilia.com/crossword/

Instead of a conclusion

If you liked it, please share it with your friends, here is a link to the crossword.
And you can give mine a star GitHub repository.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *