7 Tips for Developing Libraries That ChatGPT Needs

With the advent of generative text models such as ChatGPT, the world of software development is facing new challenges and opportunities. In this article, we’ll look at seven top recommendations for creating libraries and frameworks that make it as easy as possible for users of your products to interact with generative models and improve the quality of generated code. Applying these guidelines will help developers adapt to a new world where generative models are already becoming an integral part of the development process.

  1. Maintain backward compatibility. This was good advice before ChatGPT, but now it is doubly relevant. Suppose your library used the @handle_async decorator to mark a handler as asynchronous, and you now decide to replace it with an async=True argument. ChatGPT will most likely not read your release notes. It will keep answering "the old way", based on the large number of usages on GitHub and answers on Stack Overflow. Getting the library architecture right at the start therefore becomes a much more important task, because breaking backward compatibility will only get harder over time.
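One way to soften such a change is to keep the old decorator as a thin deprecated alias. The sketch below is only an illustration, not a real library's API: `handler` and the `on_message_*` functions are hypothetical, and the new argument is spelled `is_async` because `async` is a reserved word in Python and cannot be used as a keyword argument name.

```python
import warnings


def handler(func=None, *, is_async=False):
    """New-style registration: @handler or @handler(is_async=True)."""
    def decorate(f):
        f.is_async = is_async  # record the mode on the function itself
        return f
    # Support both bare @handler and parameterized @handler(...)
    return decorate(func) if func is not None else decorate


def handle_async(func):
    """Old-style decorator, kept as a thin deprecated alias."""
    warnings.warn(
        "@handle_async is deprecated; use @handler(is_async=True)",
        DeprecationWarning,
        stacklevel=2,
    )
    return handler(is_async=True)(func)


@handle_async  # old spelling, still what a model is likely to generate
def on_message_old(msg):
    return msg


@handler(is_async=True)  # new spelling
def on_message_new(msg):
    return msg
```

Code generated from older examples keeps working, while the warning nudges human readers toward the new form.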

  2. Use obvious naming. If there are several solutions in your niche, try to keep the names of at least the main methods and classes the same across them, because it is hard for ChatGPT to tell which library the developer expects it to use. For example, if there is a generally accepted standard, say model.get_feature_importance() for getting the feature importances of a model, try not to invent your own variant such as model.features.importance() or model.feature_importances_ (compare CatBoost and XGBoost). Such a data structure may well be more correct architecturally, but Copilot will then produce incorrect code for the analysts who use your solution.
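If your internal design really does prefer a different shape, one option is to expose the conventional name as a thin alias as well. This is a minimal sketch with a hypothetical `TinyModel` class; it does not reproduce the API of CatBoost, XGBoost, or any other real library.

```python
class TinyModel:
    """Hypothetical model used only to illustrate naming aliases."""

    def __init__(self, importances):
        self._importances = list(importances)

    def get_feature_importance(self):
        """Method-style accessor under the conventional name."""
        return list(self._importances)

    @property
    def feature_importances_(self):
        """Attribute-style accessor that delegates to the same data."""
        return self.get_feature_importance()


model = TinyModel([0.7, 0.2, 0.1])
print(model.get_feature_importance())
print(model.feature_importances_)
```

Both spellings return the same values, so generated code that guesses either convention still runs.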

  3. Make documentation more example-oriented. A detailed description of the entities in your library, what they do, and how they interact can be very useful for experienced developers, but neural networks give more relevant answers when they have thousands of code examples to learn from.
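One cheap way to produce examples that stay correct is to put them in docstrings and run them as tests. The sketch below uses Python's standard `doctest` module; the `slugify` function is a made-up stand-in for one of your library's functions.

```python
import doctest


def slugify(title):
    """Turn a title into a URL-friendly slug.

    >>> slugify("Hello World")
    'hello-world'
    >>> slugify("  Type   Hints  ")
    'type-hints'
    """
    return "-".join(title.lower().split())


# Parse the examples out of the docstring and execute them.
parser = doctest.DocTestParser()
test = parser.get_doctest(slugify.__doc__, {"slugify": slugify}, "slugify", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)

# results.failed == 0 means the documented examples still match the code.
print(results)
```

Every example that passes is a snippet that can safely be copied into the documentation, an issue answer, or a prompt.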

  4. Add type hints. For human programmers it might be enough to state once, somewhere at the beginning of the documentation, that user_id is passed to all methods of your library as a string (or as an int, a UserID object, or a floating-point number). For neural networks such as Copilot, which parse your entire development environment, the code is much easier to understand if you explicitly write user_id: str. Of course, this applies only to dynamically typed languages.
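As a small illustration of what explicit hints look like, here is a sketch in which the identifier type is spelled out on every function. `UserId`, `User`, `get_user`, and `rename_user` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import NewType

# A distinct name for "string that identifies a user".
UserId = NewType("UserId", str)


@dataclass
class User:
    user_id: UserId
    name: str


# Tiny in-memory store standing in for a real backend.
_USERS = {UserId("u-42"): User(UserId("u-42"), "Ada")}


def get_user(user_id: UserId) -> User:
    """Look up a user; the signature alone says what to pass in."""
    return _USERS[user_id]


def rename_user(user_id: UserId, new_name: str) -> User:
    """Change a user's name and return the updated record."""
    user = get_user(user_id)
    user.name = new_name
    return user
```

At runtime `UserId` is just a `str`, so existing callers are unaffected, while tools that read the signatures see the intended type in every place it is used.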

  5. More text, fewer pictures. Neural networks at this stage of development are not particularly good at dealing with pictures, so the documentation should rely on text. You can still include diagrams and graphs, but make sure the text in them can be copied (that is, draw them with HTML + CSS instead of inserting images exported from Photoshop).

  6. Extensive getting started. If you are developing a rather niche solution and you understand that large models will not learn how to use it from their training data, write a detailed, copy-friendly getting started guide that users can feed to a neural network themselves before asking questions. New text models (for example Claude-instant-100k) accept up to 100k tokens per request, which is roughly 75k English words (for comparison, ALL works written by A.S. Pushkin contain ~900k words). A model can therefore easily take in your getting started guide and answer some of the developers' questions.
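To check whether a guide is likely to fit into such a request, a rough estimate is usually enough. The sketch below assumes the common rule of thumb of about 4 characters per token for English text; real tokenizers differ, so treat the result as an approximation. The function names, the reserved answer budget, and the sample text are all illustrative.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate based on character count (assumed ratio)."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(
    text: str,
    context_limit: int = 100_000,
    reserved_for_answer: int = 4_000,
) -> bool:
    """Check that the text leaves room for the question and the answer."""
    return estimate_tokens(text) <= context_limit - reserved_for_answer


# Placeholder content standing in for a real getting started guide.
getting_started = "Install the package, then configure and run it.\n" * 200

print(estimate_tokens(getting_started), fits_in_context(getting_started))
```

If the guide does not fit, splitting it by topic so that each part fits on its own is a simple fallback.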

  7. Documentation in English. This seems to be the norm already, even if you are writing a local product (for example, a library for visualizing metro maps), but I still regularly stumble upon non-English documentation. Models handle English text much more readily: they spend far fewer tokens on it and generate much more relevant responses. In addition, English is well suited to technical writing thanks to its direct word order and its lack of grammatical cases and genders.

I would be interested in your opinion on this: how important is it to make "GPT-oriented" interfaces?
