Challenging projects for programmers to learn new things

I mostly learned programming on my own. When I had an exciting idea, I figured out what I needed to learn to make it work. For example, when I became interested in how search engines work, I ended up reading about the computational efficiency of sets. That came out of a concrete problem: “how do I know if I’ve already crawled this URL?” when there were already thousands of them. To answer that question quickly, I used a set, where a membership check takes O(1) time on average, rather than scanning every stored URL, which takes O(n).
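As a small illustration of that idea, here is a minimal sketch of the "have I already crawled this URL?" check. The names `crawled` and `should_crawl` are made up for this example and are not taken from my original crawler.

```python
# Hypothetical names for illustration only.
crawled = set()  # URLs that have already been visited


def should_crawl(url: str) -> bool:
    """Return True the first time a URL is seen and False afterwards."""
    if url in crawled:   # hash lookup: O(1) on average
        return False
    crawled.add(url)
    return True


first = should_crawl("https://example.com/a")
second = should_crawl("https://example.com/a")
```

With a list instead of a set, the `in` check would have to compare against every stored URL, which is where the O(n) cost comes from.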

Learning what you need to solve a problem is fun, but following your own coding path leaves gaps in your knowledge. It seems to me that if you constantly set yourself difficult tasks, these gaps will be filled in along the way. (Even if it takes longer than taking a course. Interest is an important motivator for moving forward; pursue what interests you.)

When I first began to understand computational efficiency and tried to speed up my programs, I was working on a problem for a search engine. Since then, I sometimes wonder: what should I do next? What will my next challenging task be? That depends a lot on what you already know: some ideas make sense now, others are not yet within reach. This is how we learn.

I decided to make my own list of projects that support my interest in programming. It is a list in the style of the series “Challenging projects every programmer should try” by Austin Henley.

Create a web crawler

A crawler is a bot that crawls through web pages and stores their content. Crawlers are used by search engines to explore the web. The content of web pages is “indexed”. This means that the pages are saved somewhere for later searching.

I decided to create a search engine for a small community in which I participate. The community maintained a list of about a thousand sites, compiled by people contributing to its shared wiki. I used it as a crawl list. You can create your own list of sites and write your own search engine. For example, you could make a search engine for your favorite anime blogs. Or Taylor Swift fan sites. Anything at all, as long as it’s on the Internet.

The most important discovery for me in this process was that the web is the Wild West. You can never assume that someone's web page will look the way you expect. Building a crawler is an exercise in reliably retrieving as many websites as possible without overloading the sites.

When building a crawler, you will learn:

  1. How to download a web page

  2. Content crawling standards (robots.txt, meta tags for “adult” content)

  3. About rate limits on requests

  4. About exponential backoff

  5. When to crawl and re-crawl sites

  6. About content negotiation

  7. About ETags

  8. And much more
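Several items on this list fit in a few lines each. The sketch below uses only Python's standard library and shows three of them: checking robots.txt rules, computing a retry delay that doubles after every failure (exponential backoff), and sending a stored ETag so the server can answer "304 Not Modified" instead of resending the page. The bot name and all function names are assumptions made for this example, not the code of my original crawler.

```python
import urllib.error
import urllib.request
import urllib.robotparser
from typing import Optional

USER_AGENT = "example-crawler"  # hypothetical bot name


def allowed_by_robots(robots_txt: str, url: str) -> bool:
    """Check a URL against the text of a site's robots.txt file."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(USER_AGENT, url)


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), capped at `cap`."""
    return min(cap, base * 2 ** attempt)


def conditional_headers(etag: Optional[str]) -> dict:
    """Build request headers; include If-None-Match when an ETag is stored."""
    headers = {"User-Agent": USER_AGENT}
    if etag:
        headers["If-None-Match"] = etag
    return headers


def fetch(url: str, etag: Optional[str] = None, timeout: float = 10.0):
    """Return (status, body, etag). Status 304 means the stored copy is current."""
    request = urllib.request.Request(url, headers=conditional_headers(etag))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as error:
        if error.code == 304:  # urllib reports 304 as an HTTPError
            return 304, b"", etag
        raise


rules = "User-agent: *\nDisallow: /private/\n"
can_read_public = allowed_by_robots(rules, "https://example.com/blog/post")
can_read_private = allowed_by_robots(rules, "https://example.com/private/notes")
```

A real crawler would also add some random jitter to the delay and keep a per-site schedule, but the doubling delay is the core of the idea.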

The web truly is the Wild West. But this Wild West has plenty of delightfully technical challenges to solve. This task occupied my brain for several months. Although my search engine is no longer active, this project gave me the confidence that I can do more than write tiny Python scripts.

Create an autocomplete system

Imagine you are writing a blog post. How can you automatically complete words based on a sequence of letters? Let's take this article as an example. If I start a word with “expon”, how can I efficiently suggest “exponential” as the finished word? This is where the difficulty lies. I won't go into detail about the inner workings of this project, but I had a lot of fun making it!

Once you understand how to automatically complete words, another challenge arises: how to decide which completion to recommend. If I enter “exponent”, how does the auto-completion algorithm know whether to recommend “exponent” or “exponential”?
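One common way to approach both questions is a trie (prefix tree) that also counts how often each word has been seen. The sketch below is only one possible design, not a description of how my own project works; the class name `Trie` and the ranking rule (most frequent first, then alphabetical) are assumptions for this example.

```python
class Trie:
    """Prefix tree that remembers how often each word was inserted."""

    def __init__(self):
        self.children = {}  # next letter -> child node
        self.count = 0      # times a word ending at this node was inserted

    def insert(self, word):
        node = self
        for letter in word:
            node = node.children.setdefault(letter, Trie())
        node.count += 1

    def complete(self, prefix, limit=3):
        """Return up to `limit` known words starting with `prefix`."""
        node = self
        for letter in prefix:
            node = node.children.get(letter)
            if node is None:
                return []  # no known word starts with this prefix
        found = []
        stack = [(node, prefix)]
        while stack:  # walk every node below the prefix
            current, word = stack.pop()
            if current.count:
                found.append((word, current.count))
            for letter, child in current.children.items():
                stack.append((child, word + letter))
        # Most frequent words first; ties are broken alphabetically.
        found.sort(key=lambda item: (-item[1], item[0]))
        return [word for word, _ in found[:limit]]


trie = Trie()
for word in "exponential exponential exponent export explore".split():
    trie.insert(word)

suggestions = trie.complete("expon")
```

Here “exponential” is ranked above “exponent” only because it was inserted twice. A real system might weigh recency or context instead of a raw count.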

Write a program to compress files

There are many great file compression tools out there, but have you ever wondered how they make those files smaller?

Here's a task for you: download this article as an HTML file. Write a program that produces a compressed version of that HTML file. Your program must also be able to reproduce the original file exactly from the compressed version. Everything should remain the same, including spaces and capital letters.

Here are some of the topics you might find useful in learning about this area (though if you are interested in this task, I recommend trying to write a compression program without reading too much about it!):

  • Information theory

  • Dictionary coding (a way of representing information)

  • Huffman coding (a popular compression algorithm)

  • Entropy (a measure of the amount of “information” in a file; “information” is put in quotation marks intentionally)
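If you would rather peek at one of these topics first, here is a minimal sketch of Huffman coding together with an entropy estimate. It is deliberately simplified: the "compressed" output is a string of '0' and '1' characters rather than packed bytes, and the code table is kept in memory instead of being stored with the file, so treat it as an illustration rather than a finished compressor. All function names are chosen for this example.

```python
import heapq
import math
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte frequencies, in bits per byte."""
    total = len(data)
    return -sum(n / total * math.log2(n / total) for n in Counter(data).values())


def build_codes(data: bytes) -> dict:
    """Map each byte value to a bit string; frequent bytes get shorter codes."""
    counts = Counter(data)
    if not counts:
        return {}
    # Heap entries: (frequency, unique tie-breaker, {byte: code so far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct byte still needs a 1-bit code
        return {sym: "0" for sym in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        freq_a, _, left = heapq.heappop(heap)   # the two rarest subtrees
        freq_b, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (freq_a + freq_b, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data: bytes, codes: dict) -> str:
    return "".join(codes[byte] for byte in data)


def decode(bits: str, codes: dict) -> bytes:
    lookup = {code: sym for sym, code in codes.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:  # codes are prefix-free, so the first match is right
            out.append(lookup[current])
            current = ""
    return bytes(out)


html = b"<p>hello hello hello</p>"
codes = build_codes(html)
bits = encode(html, codes)
restored = decode(bits, codes)
```

The entropy figure gives a rough lower bound on the average number of bits per byte that a coder of this kind (one code per byte, no context) can reach.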

Implement BitCask

BitCask is an algorithm for storing keys and values. Key-value stores map keys (names) to values (blocks of information). For example, I could store information about posts on my blog like this:

{"title": "(Even more) challenging programming projects you should try", "published": "2024-02-28"}

Here, title and published are the keys (names), each associated with a value.

BitCask needs nothing but your file system. Keys and values are saved to files, and each file is append-only: you can only add information to it. You can delete keys, but a deletion is written as a new record that overrides the old key rather than removing the stored value in place. The BitCask documentation is a short and useful article that helped me implement the algorithm. Your task:

  1. Create a cask (a key-value store based on the BitCask algorithm)

  2. Add elements to the cask

  3. Retrieve elements from the cask

  4. Remove elements from the cask

  5. Close the cask

When the cask is closed, a merge operation is performed, merging all of the cask's files into one.
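To make the five steps concrete, here is a heavily simplified sketch. It assumes a single append-only file of JSON lines and an in-memory index from key to byte offset. The class name `Cask`, the record layout, and the `deleted` flag are inventions for this example. The real BitCask design is richer (for instance, it spreads data over several files, which is why its merge step combines files), so treat this as a toy version of the idea rather than a faithful implementation.

```python
import json
import os


class Cask:
    """Toy append-only key-value store: one JSON record per line."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> byte offset of its latest live record
        self.end = 0     # current size of the data file in bytes
        open(path, "ab").close()  # create the file if it does not exist
        with open(path, "rb") as existing:
            for line in existing:  # rebuild the index by replaying the log
                self._apply(json.loads(line), self.end)
                self.end += len(line)
        self.file = open(path, "ab")

    def _apply(self, record, offset):
        if record["deleted"]:
            self.index.pop(record["key"], None)
        else:
            self.index[record["key"]] = offset

    def _append(self, record):
        line = (json.dumps(record) + "\n").encode("utf-8")
        self.file.write(line)
        self.file.flush()
        self._apply(record, self.end)
        self.end += len(line)

    def put(self, key, value):
        self._append({"key": key, "value": value, "deleted": False})

    def get(self, key, default=None):
        if key not in self.index:
            return default
        with open(self.path, "rb") as data:
            data.seek(self.index[key])
            return json.loads(data.readline())["value"]

    def delete(self, key):
        # Nothing is erased: a "deleted" marker is appended instead.
        self._append({"key": key, "value": None, "deleted": True})

    def close(self):
        """Close the file, then merge: keep only the latest live records."""
        self.file.close()
        live = {key: self.get(key) for key in self.index}
        merged_path = self.path + ".merge"
        with open(merged_path, "wb") as merged:
            for key, value in live.items():
                record = {"key": key, "value": value, "deleted": False}
                merged.write((json.dumps(record) + "\n").encode("utf-8"))
        os.replace(merged_path, self.path)
```

Usage follows the task list: `cask = Cask("blog.cask")`, then `cask.put("title", "...")`, `cask.get("title")`, `cask.delete("title")`, and finally `cask.close()`.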

Write a programming language

Wow! Programming language? Only serious specialists are capable of this! I thought so too. But it turned out that this is not true! You too can create a programming language. And you don't even need to read hundreds of pages of theory before you start (however, the more you read, the more information you will have to design your language!).

Designing a programming language allows you to decide exactly how you want to write code. You can use ready-made patterns or come up with your own. You write the rules yourself.

There are many types of programming languages, but the best place to start is by writing a language that doesn't require a compiler. Perhaps I'm biased because that's how I learned, but the thought of having to write a compiler before I had a little more experience with programming language theory seemed daunting to me. So I wrote an “interpreted” language. The file is read, parsed, then executed; no compilation.

An interpreted language consists of several basic elements:

  1. A “grammar”, which determines the structure of the language;

  2. A “lexical analyzer”/“parser” that takes arbitrary text (for example, a script written in your language!) and turns it into an Abstract Syntax Tree (AST) based on your grammar;

  3. A system of symbolic expressions that reads a syntax tree and performs some operations on it.

Each of the three components listed above presents a separate technical challenge. I recommend starting with writing a grammar and a parser. Your grammar will define how the language works (that is, how to declare variables, which characters are allowed and disallowed, how nesting works or doesn't work). To turn text into a syntax tree according to the grammar, you can use a ready-made parsing library. I used Lark for Python. A symbolic expression system will take a syntax tree and apply program logic (that is, store variables, do math, manipulate strings and booleans, and do whatever else you want the language to do).

P.S. It's also a good idea to start with Lisp!
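To show how small a first interpreter can be, here is a sketch of a tiny Lisp-style language with the same three parts: an implicit grammar (numbers, names, and parenthesized lists), a parser that builds a syntax tree, and an evaluator that walks the tree. It swaps in a hand-written tokenizer and recursive parser instead of Lark, and the supported forms (`define`, `if`, and four operators) are choices made for this example, not the language I wrote.

```python
import operator


def tokenize(source):
    """Split source text into parentheses and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Turn a token list into a nested-list syntax tree (consumes tokens)."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return node
    try:
        return int(token)
    except ValueError:
        return token  # anything that is not an integer is a name


def default_env():
    """Built-in names available to every program."""
    return {"+": operator.add, "-": operator.sub,
            "*": operator.mul, "<": operator.lt}


def evaluate(node, env):
    if isinstance(node, str):       # a name: look it up
        return env[node]
    if not isinstance(node, list):  # a number: evaluates to itself
        return node
    head, *args = node
    if head == "define":            # (define name expr)
        name, expr = args
        env[name] = evaluate(expr, env)
        return env[name]
    if head == "if":                # (if test then else)
        test, then, otherwise = args
        return evaluate(then if evaluate(test, env) else otherwise, env)
    function = evaluate(head, env)  # otherwise: a function call
    return function(*[evaluate(arg, env) for arg in args])


def run(source, env):
    return evaluate(parse(tokenize(source)), env)


env = default_env()
run("(define x 4)", env)
product = run("(* x (+ 1 2))", env)
branch = run("(if (< x 10) 1 0)", env)
```

There is no error handling here: unbalanced parentheses or an unknown name will raise a Python exception, and handling those cases is a good first extension.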

About this list, and other ideas that could be on it

The above list is not a mandatory checklist, but rather a source of inspiration for choosing your projects. Perhaps one of the items will inspire you to do something. Perhaps none of them will interest you. That's fine too. If that happens, I recommend keeping this post somewhere in the back of your mind. Perhaps the day will come when one of the ideas appeals to you. Five years ago I probably would have considered the above list too complex. A year ago I would have dreamed of completing it.

I recommend looking for interesting ideas in the article by Austin Henley. It was thanks to his post that I wrote mine.

If you implement anything after reading this post, please let me know! Email me at readers [at] jamesg [dot] blog. If you need help with the ideas presented above, please write as well; I will be happy to help. I have completed all the tasks listed here; they continue to cultivate my childhood fascination with coding, and I hope it will be the same for you.

Now the question is: what other ideas should be on this list? What do you think would be a great challenge for programmers?
