How to create your first open source project in Python (17 steps)

Every software developer should know how to create a library from scratch. In the process, you can learn a lot. Just do not forget to stock up on time and patience.

It may seem that creating an open source library is difficult, but you don’t need to be a battle-hardened veteran to understand the code, just as you don’t need a sophisticated product idea. You will need perseverance and time, though. I hope this guide helps you create your first project while spending as little of both as possible.

In this article, we will walk through the process of creating a basic Python library. Remember to replace my_package, my_file and similar placeholders in the code below with your own names.

Step 1: Make a Plan

We plan to create a simple library for use in Python. This library will allow the user to easily convert a Jupyter notebook into an HTML file or Python script.
The first iteration of our library will allow you to call a function that will display a specific message.

Now that we already know what we want to do, we need to come up with a name for the library.

Step 2: Name the Library

It’s hard to come up with names. They should be short, unique and memorable. They should also be all lowercase, without dashes or other punctuation. Underscores are allowed but not recommended. Before you settle on a name, make sure it is available on GitHub, Google and PyPI.

If you hope and believe that one day your library will earn 10,000 GitHub stars, it is worth checking whether the name is also available on social networks. In this example, I will name my library notebookc, because the name is available, short and more or less describes the essence of my idea.
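
The style rules above are easy to check mechanically. Here is a minimal sketch; check_library_name is a hypothetical helper of my own, and the 20-character limit is an arbitrary assumption, not an official rule. It does not check availability on GitHub or PyPI.

```python
import re

# Hypothetical helper, not part of any packaging tool: it only encodes the
# style advice above (short, lowercase, no dashes or other punctuation).
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def check_library_name(name, max_length=20):
    """Return a list of style problems found in a candidate library name."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("use only lowercase letters, digits and underscores")
    if "_" in name:
        problems.append("underscores are allowed but not recommended")
    if len(name) > max_length:  # max_length is an assumed, arbitrary limit
        problems.append(f"longer than {max_length} characters")
    return problems


print(check_library_name("notebookc"))      # no problems reported
print(check_library_name("Notebook-Conv"))  # flags the capitals and the dash
```

An empty list means the name passes these checks; you still have to search for it by hand.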

Step 3: Set Up Your Environment

Make sure you have Python 3.7, Git (with a GitHub account), and Homebrew installed and configured. If you are missing any of these, here are the details:

Python

Download Python 3.7 here and install it.

GitHub

If you don’t have a GitHub account, go to this link and sign up for a free account. See how to install and configure Git here. You will need the command line utility. Follow the links, download and install everything you need, choose a username and specify your email.

Homebrew

Homebrew is a package manager for macOS. You will find installation instructions here.

Venv

Starting with Python 3.6, it is recommended to use venv to create a virtual environment for developing libraries. There are many ways to manage virtual environments with Python, and they all change over time. You can read the discussion here, but as they say, trust, but verify.

venv has been included with Python by default since Python 3.3. Note that venv has installed pip into new environments since Python 3.4, along with setuptools up to Python 3.11.

Create a virtual Python 3.7 environment using the following command:

python3.7 -m venv my_env

Replace my_env with a name of your choice. Activate the environment like this:

source my_env/bin/activate

You should now see (my_env) (or the name you chose for your virtual environment) at the far left of the terminal prompt.

When done, deactivate the virtual environment with deactivate.
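
If you prefer to check or script this from Python, the standard library venv module offers the same functionality as the command above. A minimal sketch follows; the function names are my own, and the with_pip flag is passed explicitly because venv.create does not install pip unless asked.

```python
import sys
import venv
from pathlib import Path


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def create_environment(path, with_pip=True):
    """Create a virtual environment at `path`, similar to `python -m venv path`."""
    # with_pip=True asks venv to bootstrap pip into the new environment.
    venv.create(path, with_pip=with_pip)
    # Every environment created by venv contains a pyvenv.cfg file.
    return Path(path) / "pyvenv.cfg"


print(in_virtual_environment())
```

Running this with the environment activated should print True; with the environment deactivated it should print False.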

Now let’s configure GitHub.

Step 4: Create an Organization on GitHub

GitHub is the market leader among hosting services for Git repositories. Two other popular options are GitLab and Bitbucket. In this guide, we will use GitHub.

You will use Git and GitHub often, so if you are not familiar with them, you can refer to my article.

Create a new organization on GitHub. Follow the instructions. I called my organization notebooktoall. You can create a repository under your personal account, but one of the goals of the work is to learn how to create an open source project for a wider community.

Step 5: Configure GitHub Repo

Create a new repository. I called mine notebookc.

Add a .gitignore from the drop-down list and choose Python for your repository. The .gitignore file lists the folders and file types that are excluded from your repository. You can modify .gitignore later to exclude other unnecessary or confidential files.

I recommend choosing a license in the Select a license list. It determines what users of your repository can do. Some licenses allow more than others. If you do not select anything, then standard copyright laws automatically start to apply. Learn more about licenses here.

For this project, I chose the GNU General Public License v3.0, because it is popular, proven and “guarantees users the freedom to use, study, share and change software” (source).

Step 6: Clone and Add Directories

Choose where you want to clone your repository and run the following command:

git clone https://github.com/notebooktoall/notebookc.git

Substitute your organization and repository.

Go to the project folder using the desktop graphical interface or a code editor. Or use the command line with cd my-project and then list the files with ls -A.

Your source folders and files should look like this:

.git
.gitignore
LICENSE
README.md

Create a subfolder for the main project files. I advise you to name it the same as your library. Make sure there are no spaces in the name.

Create a file named __init__.py in the main subfolder. This file will remain empty for now. It marks the folder as a package so that its modules can be imported.

Create another file with the same name as the main subfolder, plus the .py extension. My file is called notebookc.py. You can name this Python file as you like. Library users will refer to the name of this file when importing the module.

The contents of my notebookc directory are as follows:

.git
.gitignore
LICENSE
README.md
notebookc/__init__.py
notebookc/notebookc.py
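
If you would rather create the subfolder and its two files from Python than by hand, something like the following could work. This is only a sketch; scaffold_package is a hypothetical helper of my own, not part of any tool.

```python
from pathlib import Path


def scaffold_package(project_root, package_name):
    """Create package_name/__init__.py and package_name/package_name.py."""
    package_dir = Path(project_root) / package_name
    package_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for filename in ("__init__.py", f"{package_name}.py"):
        path = package_dir / filename
        path.touch(exist_ok=True)  # leaves existing content untouched
        created.append(path)
    return created
```

Calling scaffold_package(".", "notebookc") from the top level of the cloned repository should produce the layout listed above.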

Step 7: Create and Install requirements_dev.txt

At the top level of the project directory, create a file named requirements_dev.txt. Often this file is called requirements.txt. Naming it requirements_dev.txt shows that these libraries only need to be installed by the project’s developers.

In the file, specify that pip and wheel must be installed.

pip==19.0.3
wheel==0.33.1

Note that we specify exact library versions with double equal signs and full version numbers.

Pin the versions of your libraries in requirements_dev.txt.

A collaborator who forks the project repository and installs the pinned libraries from requirements_dev.txt using pip will have the same library versions as you. You know that these versions will work for them. In addition, Read the Docs will use this file to install libraries when building documentation.

In your activated virtual environment, install the libraries listed in requirements_dev.txt with the following command:

pip install -r requirements_dev.txt
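
To catch an unpinned entry before a collaborator does, you could scan the file for lines without a double equal sign. The sketch below is a simplification of my own: it ignores comments and blank lines and does not understand pip options such as -r or -e. The somepackage entry is a made-up example.

```python
def find_unpinned(requirement_lines):
    """Return the requirement lines that lack an exact `==` pin."""
    unpinned = []
    for line in requirement_lines:
        requirement = line.split("#", 1)[0].strip()  # drop comments
        if not requirement:
            continue  # skip blank lines and comment-only lines
        if "==" not in requirement:
            unpinned.append(requirement)
    return unpinned


# "somepackage" is a hypothetical, deliberately unpinned entry.
lines = ["pip==19.0.3", "wheel==0.33.1", "somepackage>=1.0", "", "# dev tools"]
print(find_unpinned(lines))  # prints ['somepackage>=1.0']
```

To check the real file, pass it the lines of requirements_dev.txt, for example via open("requirements_dev.txt").read().splitlines().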

I strongly recommend updating these libraries as new versions become available. For now, install the latest versions available on PyPI.

In the next article I will show you how to install a tool that facilitates this process. Subscribe so you don’t miss it.

Step 8: Work with the Code

For demonstration purposes, let’s create a basic function. You can create your own cool feature later.

Type the following into your main file (for me it is notebookc/notebookc/notebookc.py):

def convert(my_name):
    """
    Print a line about converting a notebook.
    Args:
        my_name (str): person's name
    Returns:
        None
    """

    print(f"I'll convert a notebook for you some day, {my_name}.")

Here is our function in all its glory.

Docstrings begin and end with three consecutive double quotes. They will be used in the next article to automatically create documentation.
Save the changes. If you want to refresh your memory of working with Git, you can take a look at this article.
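
Before moving on, it may be worth confirming that the function prints what you expect. One way, sketched here with only the standard library, is to capture standard output and compare it. capture_output is a hypothetical helper of my own, and convert is repeated so that the snippet runs on its own.

```python
import contextlib
import io


def convert(my_name):
    """Print a line about converting a notebook (same function as above)."""
    print(f"I'll convert a notebook for you some day, {my_name}.")


def capture_output(func, *args):
    """Call func(*args) and return whatever it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args)
    return buffer.getvalue()


message = capture_output(convert, "Jeff")
# print() appends a newline, so the captured text ends with "\n".
assert message == "I'll convert a notebook for you some day, Jeff.\n"
```

If the assertion passes silently, the function behaves as described.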

Step 9: Create setup.py

The setup.py file is the build script for your library. The setup function from Setuptools builds the library for upload to PyPI. setup.py contains information about your library, its version number, and which other libraries your users need.

Here is an example of my setup.py file:

from setuptools import setup, find_packages

with open("README.md", "r") as readme_file:
    readme = readme_file.read()

requirements = ["ipython>=6", "nbformat>=4", "nbconvert>=5", "requests>=2"]

setup(
    name="notebookc",
    version="0.0.1",
    author="Jeff Hale",
    author_email="jeffmshale@gmail.com",
    description="A package to convert your Jupyter Notebook",
    long_description=readme,
    long_description_content_type="text/markdown",
    url="https://github.com/your_package/homepage/",
    packages=find_packages(),
    install_requires=requirements,
    classifiers=[
        "Programming Language :: Python :: 3.7",
        "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
    ],
)

Note that long_description is set to the contents of the README.md file. The list of requirements passed to the install_requires argument of setuptools.setup includes all the dependencies your library needs in order to work.

Unlike the list of libraries required for development in the requirements_dev.txt file, this list should be as permissive as possible. Find out why here.

Limit the install_requires list to just what you need; you don’t want users to install extra libraries. Note that you only need to list libraries that are not part of the Python standard library. Your users will already have Python installed if they use your library.
Our library does not require any external dependencies, so you can remove the four libraries listed in the example above.

Change the setup.py information to match your library. There are many other optional keyword arguments and classifiers; see the list here. More detailed setup.py guides can be found here and here.

Commit your code to your local Git repository. It’s time to move on to building the library!

Step 10: Build The First Version

Twine is a collection of utilities for securely publishing Python libraries to PyPI. Add Twine on the next empty line of requirements_dev.txt like this:

twine==1.13.0

Then install Twine in your virtual environment by reinstalling the libraries from requirements_dev.txt.

pip install -r requirements_dev.txt

Then run the following command to create the library files:

python setup.py sdist bdist_wheel

This should create several folders: dist, build and, in my case, notebookc.egg-info. Let’s look at the files in the dist folder. The .whl file is a wheel, a built distribution. The .tar.gz file is the source archive.

Whenever possible, pip installs libraries as wheels on the user’s computer, because they install faster. When pip cannot do this, it falls back to the source archive.
Let’s get ready to upload our wheel and source archive.
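
To confirm that the build produced both kinds of file, you could sort the contents of dist by extension. A minimal sketch; classify_artifacts is a hypothetical helper of my own, and it looks only at file names, not at file contents.

```python
from pathlib import Path


def classify_artifacts(dist_dir):
    """Group the files in a dist folder into wheels and source archives."""
    artifacts = {"wheel": [], "sdist": [], "other": []}
    for path in sorted(Path(dist_dir).iterdir()):
        if path.name.endswith(".whl"):
            artifacts["wheel"].append(path.name)
        elif path.name.endswith(".tar.gz"):
            artifacts["sdist"].append(path.name)
        else:
            artifacts["other"].append(path.name)
    return artifacts
```

After a successful build, classify_artifacts("dist") should list one wheel and one source archive.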

Step 11: Create a TestPyPI Account

PyPI is the Python Package Index, the official repository of Python libraries. If a library is not installed locally, pip gets it from there.

TestPyPI is a working test version of PyPI. Create a TestPyPI account here and confirm your email address. Note that you need separate passwords for uploading to the test site and the official site.

Step 12: Publish the Library to TestPyPI

Use Twine to securely publish your library to TestPyPI. Enter the following command exactly as shown; no changes are required.

twine upload --repository-url https://test.pypi.org/legacy/ dist/*

You will be prompted for a username and password. Do not forget that TestPyPI and PyPI have different passwords!

If necessary, fix any errors, set a new version number in setup.py and delete the old build artifacts: the build, dist and egg-info folders. Rebuild with python setup.py sdist bdist_wheel and upload again using Twine. Meaningless version numbers on TestPyPI do not matter much, since you are the only one who will use those versions of the library.
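
Deleting the old build folders by hand gets tedious after a few attempts. Here is a small sketch that removes them with the standard library; clean_build_artifacts is a hypothetical helper of my own, and it deletes folders permanently, so point it at the right project root.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root):
    """Delete build/, dist/ and any *.egg-info folders under project_root."""
    root = Path(project_root)
    targets = [root / "build", root / "dist", *root.glob("*.egg-info")]
    removed = []
    for target in targets:
        if target.is_dir():  # silently skip folders that do not exist
            shutil.rmtree(target)
            removed.append(target.name)
    return sorted(removed)
```

Call clean_build_artifacts(".") from the top level of the project before rebuilding.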

After you have successfully uploaded your library, let’s make sure that you can install and use it.

Step 13: Verify and Use the Installed Library

Open another tab in the shell and create another virtual environment.

python3.7 -m venv my_env

Activate it.

source my_env/bin/activate

If you had already uploaded your library to the official PyPI site, you could simply run pip install your-package. For now, we can fetch the library from TestPyPI and install it using a modified command.

Here are the official instructions for installing your library from TestPyPI:

You can tell pip to download libraries from TestPyPI instead of PyPI by specifying --index-url.

pip install --index-url https://test.pypi.org/simple/ my_package

If you want pip to also fetch other libraries from PyPI, you can add --extra-index-url to point to PyPI. This is useful when the library being tested has dependencies:

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple my_package

If your library has dependencies, use the second command and substitute the name of your library.
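
If you want to script this check, the two commands above can be assembled in Python and handed to subprocess. The sketch below only builds the argument list and does not run pip; the function name is my own, and sys.executable is used so that pip runs under the interpreter of the active virtual environment.

```python
import sys

TEST_PYPI = "https://test.pypi.org/simple/"
PYPI = "https://pypi.org/simple"


def build_testpypi_install_command(package, with_pypi_fallback=False):
    """Return the pip command line for installing `package` from TestPyPI."""
    cmd = [sys.executable, "-m", "pip", "install", "--index-url", TEST_PYPI]
    if with_pypi_fallback:
        # Lets pip resolve dependencies that exist only on the real PyPI.
        cmd += ["--extra-index-url", PYPI]
    cmd.append(package)
    return cmd


print(build_testpypi_install_command("my_package", with_pypi_fallback=True))
```

To actually perform the installation, pass the returned list to subprocess.run(cmd, check=True).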

You should see the latest version of the library installed in your virtual environment.

To make sure you can use your library, start an interactive Python session in the terminal as follows:

python

Import your function and call it with a string argument. This is what my code looks like:

from notebookc.notebookc import convert
convert("Jeff")

I then get the following output:

I'll convert a notebook for you some day, Jeff.

I believe in you.

Step 14: Upload the Code to PyPI

Upload your code to the real PyPI site so people can install it using pip install my_package.

You can upload the code like this:

twine upload dist/*

Note that you need to update the version number in setup.py whenever you want to upload a new version to PyPI.
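
For small fixes, the new version number is usually just the old one with the last part increased. A minimal sketch of that arithmetic follows; bump_patch is a hypothetical helper of my own, and it assumes a plain three-part number such as the 0.0.1 used in setup.py above.

```python
def bump_patch(version):
    """Return `version` with its last numeric component increased by one."""
    # Assumes exactly three dot-separated integers, e.g. "0.0.1".
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


print(bump_patch("0.0.1"))  # prints 0.0.2
```

You would still edit the version argument in setup.py yourself; this only computes the next value.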

Great, now let’s upload our work to GitHub.

Step 15: Upload the Library to GitHub

Make sure your code is committed.

My notebookc project folder looks like this:

.git
.gitignore
LICENSE
README.md
requirements_dev.txt
setup.py
notebookc/__init__.py
notebookc/notebookc.py

Exclude any virtual environments that you do not want to upload. The Python .gitignore file that we selected when creating the repository should keep build artifacts from being tracked. You may need to delete the virtual environment folders.

Push your local branch to GitHub with git push origin my_branch.

Step 16: Create and Merge the PR

In the browser, navigate to GitHub. You should see the option to open a pull request. Click the green buttons to create the PR, merge it, and delete the remote branch.
Back in the terminal, delete the local branch with git branch -d my_feature_branch.

Step 17: Update the Release Version on GitHub

Create a new release of the library on GitHub by clicking on Releases on the main repository page. Enter the necessary release information and save.

That’s enough for today!

We will learn how to add other files and folders in future articles.
In the meantime, let’s recap the steps we have covered.

Recap: 17 steps to a working library

  1. Make a plan.
  2. Name the library.
  3. Set up your environment.
  4. Create an organization on GitHub.
  5. Configure the GitHub repo.
  6. Clone and add directories.
  7. Create and install requirements_dev.txt.
  8. Work with the code.
  9. Create setup.py.
  10. Build the first version.
  11. Create a TestPyPI account.
  12. Publish the library to TestPyPI.
  13. Verify and use the installed library.
  14. Upload the code to PyPI.
  15. Upload the library to GitHub.
  16. Create and merge the PR.
  17. Update the release version on GitHub.
