Introducing LiteX on Tang Nano 9K

The author has always liked the idea of LiteX, a framework for easily assembling SoCs on FPGAs, but never had enough time to try it. It's time to change that and document the process! We will use the Sipeed Tang Nano 9K FPGA board, which is relatively inexpensive hardware; however, most of this article applies to any LiteX-supported FPGA.

It took some learning. LiteX is written in Python or, more accurately, it uses Migen, a Python-based tool that generates Verilog. The author has never written much code in Python, let alone Migen. So, to master the basics of LiteX, the following had to be done:

  1. Take apart a minimal SoC example

  2. Configure the SoC with some peripherals already available in LiteX

  3. Write a custom application and run it on the created SoC

  4. Create a comfortable development environment

Before we continue, let's first install LiteX and build an example!

Creating an SoC from an example

It's quite simple if you're using a current Linux distribution; if you follow the installation guide, everything works on Debian 12. To build some of the examples you need the standard or full config, and you also need to install the RISC-V toolchain. Luckily, the Quick Start Guide explains it all well.

Now about the Gowin toolchain: it is not open source, although it is free. To obtain a license you must submit an application, and the toolchain can be downloaded after registration. The open source toolchain is under development, but at the time of writing (a year ago) it was not yet ready for use with LiteX.

Gowin's gw_sh executable needs to be added to the path, for example via .bashrc:

PATH="$PATH:/path/to/gowin/IDE/bin"

After installation, go to the “litex/litex-boards/litex_boards/targets” directory and run:

./sipeed_tang_nano_9k.py --build --flash

This takes quite a while: compiling, synthesizing, placing and routing, and then flashing the FPGA. The LEDs will blink cheerfully, and after connecting to the serial port at 115200 baud, a greeting will be displayed.

All this took some time; perhaps it would be appropriate to put some things in a Docker container.

As a result, we have built an example, although we have no idea what it does or how it does it. Fortunately, the source is included, so let's take a look at it. The author took some time to strip everything possible out of the sipeed_tang_nano_9k.py example to bring it closer to simple.py, and ended up with this:

import os
from migen import *

from litex.gen import *

from litex_boards.platforms import sipeed_tang_nano_9k

from litex.build.io import CRG
from litex.soc.integration.soc_core import *
from litex.soc.integration.soc import SoCRegion
from litex.soc.integration.builder import *

kB = 1024
mB = 1024*kB

# BaseSoC ------------------------------------------------------------------------------------------
class BaseSoC(SoCCore):
    def __init__(self, **kwargs):
        platform = sipeed_tang_nano_9k.Platform()

        sys_clk_freq = int(1e9/platform.default_clk_period)

        # CRG --------------------------------------------------------------------------------------
        self.crg = CRG(platform.request(platform.default_clk_name))

        # SoCCore ----------------------------------------------------------------------------------
        kwargs["integrated_rom_size"] = 64*kB  
        kwargs["integrated_sram_size"] = 8*kB
        SoCCore.__init__(self, platform, sys_clk_freq, ident="Tiny LiteX SoC on Tang Nano 9K", **kwargs)

# Build --------------------------------------------------------------------------------------------
def main():
    from litex.build.parser import LiteXArgumentParser
    # Use the module name exactly as imported above (sipeed_tang_nano_9k).
    parser = LiteXArgumentParser(platform=sipeed_tang_nano_9k.Platform, description="Tiny LiteX SoC on Tang Nano 9K.")
    parser.add_target_argument("--flash",                action="store_true",      help="Flash Bitstream.")
    args = parser.parse_args()

    soc = BaseSoC( **parser.soc_argdict)

    builder = Builder(soc, **parser.builder_argdict)
    if args.build:
        builder.build(**parser.toolchain_argdict)

    if args.load:
        prog = soc.platform.create_programmer("openfpgaloader")
        prog.load_bitstream(builder.get_bitstream_filename(mode="sram"))

    if args.flash:
        prog = soc.platform.create_programmer("openfpgaloader")
        prog.flash(0, builder.get_bitstream_filename(mode="flash", ext=".fs")) 
        prog.flash(0, builder.get_bios_filename(), external=True)

if __name__ == "__main__":
    main()

Wow, that's about 50 lines, not bad. It turns out there's a lot of magic going on in LiteX to keep the code compact. Let's try to break it down!

First come a few import directives and definitions. The line

from litex_boards.platforms import sipeed_tang_nano_9k

imports a platform file. This file describes all inputs/outputs and peripherals, and also records which programmer is used and the frequency of the on-board clock oscillator. For a custom board, such a file has to be created from scratch.

The remaining directives pull in Migen, the HDL used in LiteX, and some basic blocks for creating the SoC.

import os
from migen import * 

from litex.gen import *

from litex_boards.platforms import sipeed_tang_nano_9k

from litex.build.io import CRG
from litex.soc.integration.soc_core import *
from litex.soc.integration.soc import SoCRegion
from litex.soc.integration.builder import *

kB = 1024
mB = 1024*kB

Now it's time to move to the end of the code and look at the main function:

def main():
    from litex.build.parser import LiteXArgumentParser
    # Use the module name exactly as imported above (sipeed_tang_nano_9k).
    parser = LiteXArgumentParser(platform=sipeed_tang_nano_9k.Platform, description="Tiny LiteX SoC on Tang Nano 9K.")
    parser.add_target_argument("--flash",                action="store_true",      help="Flash Bitstream.")
    args = parser.parse_args()

    soc = BaseSoC( **parser.soc_argdict)

    builder = Builder(soc, **parser.builder_argdict)
    if args.build:
        builder.build(**parser.toolchain_argdict)

    if args.load:
        prog = soc.platform.create_programmer("openfpgaloader")
        prog.load_bitstream(builder.get_bitstream_filename(mode="sram"))

    if args.flash:
        prog = soc.platform.create_programmer("openfpgaloader")
        prog.flash(0, builder.get_bitstream_filename(mode="flash", ext=".fs")) # FIXME
        prog.flash(0, builder.get_bios_filename(), external=True)

First of all, LiteXArgumentParser is imported and an instance is created. This is a very handy LiteX feature that makes it easy to configure the SoC using command line arguments. By running:

./sipeed_tang_nano_9k.py --help

we get the complete list of parameters, and one of them selects the CPU type.

Yes, the processor type is just a command line argument, awesome!
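
To make the idea concrete, here is a minimal sketch built on Python's standard argparse module. It is not LiteXArgumentParser, only an illustration of the pattern: options are parsed once, and the ones that describe the SoC are collected into a dictionary that can be passed on as keyword arguments. The helper names make_parser and soc_kwargs are made up for this example.

```python
import argparse

def make_parser():
    """Build a tiny stand-in for a target script's command-line parser."""
    parser = argparse.ArgumentParser(description="Toy SoC target (illustration only).")
    parser.add_argument("--build", action="store_true", help="Build the bitstream.")
    parser.add_argument("--flash", action="store_true", help="Flash the bitstream.")
    parser.add_argument("--cpu-type", default="vexriscv", help="CPU core to instantiate.")
    parser.add_argument("--integrated-rom-size", type=lambda s: int(s, 0), default=0x10000,
                        help="ROM size in bytes (decimal or 0x... hex).")
    return parser

def soc_kwargs(args):
    """Collect only the options that describe the SoC itself."""
    return {
        "cpu_type": args.cpu_type,
        "integrated_rom_size": args.integrated_rom_size,
    }

# Parse an explicit argument list instead of sys.argv so the demo is repeatable.
args = make_parser().parse_args(["--build", "--cpu-type", "picorv32"])
print("SoC options:", soc_kwargs(args))
print("build:", args.build, "flash:", args.flash)
```

The resulting dictionary plays the same role as parser.soc_argdict in the listing above: it is expanded with ** when the SoC class is instantiated.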

The BaseSoC class is then instantiated, which configures the SoC. We'll look at it a little later. The LiteX Builder is then created with the SoC as an argument to build the final SoC.

Finally, the --load and --flash arguments are processed. Both call the openFPGALoader tool, either to load the bitstream into the FPGA's SRAM or to write it to the SPI flash on the board. openFPGALoader is installed using the litex_setup script.

And here is the SoC itself!

# BaseSoC ------------------------------------------------------------------------------------------
class BaseSoC(SoCCore):
    def __init__(self, **kwargs):
        platform = sipeed_tang_nano_9k.Platform()

        sys_clk_freq = int(1e9/platform.default_clk_period)

        # CRG --------------------------------------------------------------------------------------
        self.crg = CRG(platform.request(platform.default_clk_name))

        # SoCCore ----------------------------------------------------------------------------------
        kwargs["integrated_rom_size"] = 64*kB  
        kwargs["integrated_sram_size"] = 8*kB
        SoCCore.__init__(self, platform, sys_clk_freq, ident="Tiny LiteX SoC on Tang Nano 9K", **kwargs)

The BaseSoC class creates an SoC that will be passed to the LiteX Builder a little later. By default, a LiteX SoC contains a VexRiscv processor, a Wishbone bus, some RAM, ROM, a timer and a UART. These are all basic configurable options. Here we set the clock frequency and create a CRG (clock and reset generator), which should contain all the reset and clock signals. For now there is only one clock signal; we will look at this in more detail later.

We also set the size of the ROM and RAM, which, strictly speaking, is not mandatory if the default values are suitable. All this information is passed to SoCCore.__init__, which sets up our SoC.
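
The clock-frequency line in BaseSoC is just a period-to-frequency conversion. The sketch below works through it, assuming the platform's default clock is the 27 MHz oscillator (the clk27 / 27e6 values used in the CRG code later in this article) and that the period is given in nanoseconds. The helper name period_ns_to_hz is made up for this example, and round() is used here instead of int() because int() truncates, so a result a hair below the exact value would come out 1 Hz short.

```python
def period_ns_to_hz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in hertz."""
    # 1e9 ns per second divided by the period gives cycles per second.
    return round(1e9 / period_ns)

clk27_period_ns = 1e9 / 27e6   # about 37.04 ns for a 27 MHz oscillator (assumed)
sys_clk_freq = period_ns_to_hz(clk27_period_ns)
print(sys_clk_freq)            # prints 27000000
```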

That's it, the minimal SoC is ready, amazing. The complete example can be seen on GitHub.

Now let's gradually add new features to it!

Adding CRG

Currently the CRG is very limited compared to the original example: there isn't even a reset button! Let's change this and add a PLL and a reset.

from litex.soc.cores.clock.gowin_gw1n import GW1NPLL

class _CRG(LiteXModule):
    def __init__(self, platform, sys_clk_freq):
        self.rst    = Signal()
        self.cd_sys = ClockDomain()

        # Clk / Rst
        clk27 = platform.request("clk27")
        rst_n = platform.request("user_btn", 0)

        # PLL
        self.pll = pll = GW1NPLL(devicename=platform.devicename, device=platform.device)
        self.comb += pll.reset.eq(~rst_n)
        pll.register_clkin(clk27, 27e6)
        pll.create_clkout(self.cd_sys, sys_clk_freq)

Compared to the previous design, the CRG now uses one of the user buttons as a reset input. A PLL is instantiated, so far with the same frequency at the input and output, but this can be changed by passing the requested clock frequency as a parameter, great! The reset signal resets the PLL, which in turn resets the processor.

Time for peripherals

LiteX already has quite a lot of peripherals available: timers, UART, I2C, SPI, etc. Unfortunately, the documentation leaves much to be desired, but after some research the author was able to get most of them to work.

Let's add some devices to the sipeed_tang_nano_9k.py file!

from litex.soc.cores.timer import *
from litex.soc.cores.gpio import *
from litex.soc.cores.bitbang import I2CMaster
from litex.soc.cores.spi import SPIMaster
from litex.soc.cores import uart

Done, that covers the most common peripherals. Fortunately, instantiating them is also easy!

        self.timer1 = Timer()
        self.timer2 = Timer()
        
        self.leds = GPIOOut(pads = platform.request_all("user_led"))
        
        # Serial stuff 
        self.i2c0 = I2CMaster(pads = platform.request("i2c0"))
        
        self.add_uart("serial0", "uart0")
        
        self.gpio = GPIOIn(platform.request("user_btn", 1))

Two additional timers, several LEDs, I2C, a UART and a GPIO input in a dozen lines of code. It's much simpler than VHDL or Verilog. Now the platform file needs to be extended so that LiteX knows which pins to use for which signals:

    ("gpio", 0, Pins("25"), IOStandard("LVCMOS33")),
    ("gpio", 1, Pins("26"), IOStandard("LVCMOS33")),
    ("gpio", 2, Pins("27"), IOStandard("LVCMOS33")),
    ("gpio", 3, Pins("28"), IOStandard("LVCMOS33")),
    ("gpio", 4, Pins("29"), IOStandard("LVCMOS33")),
    ("gpio", 5, Pins("30"), IOStandard("LVCMOS33")),
    ("gpio", 6, Pins("33"), IOStandard("LVCMOS33")),
    ("gpio", 7, Pins("34"), IOStandard("LVCMOS33")),
    
    ("i2c0", 0,
        Subsignal("sda", Pins("40")),
        Subsignal("scl", Pins("35")),
        IOStandard("LVCMOS33"),
    ),
    
    ("uart0", 0,
        Subsignal("rx", Pins("41")),
        Subsignal("tx", Pins("42")),
        IOStandard("LVCMOS33")
    ),
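
When extending a platform file by hand it is easy to assign the same pin twice. The following sketch checks for that. It uses plain tuples as a stand-in for the Pins/Subsignal entries above, so it is not LiteX code; the tuple layout (resource, index, signal, pin) and the function name find_pin_conflicts are assumptions made for this example, and it only covers the entries added here, not the ones already in the platform file.

```python
from collections import defaultdict

# Plain-tuple copy of the entries above: (resource, index, signal or None, pin).
EXTRA_IOS = [
    ("gpio", 0, None, "25"), ("gpio", 1, None, "26"),
    ("gpio", 2, None, "27"), ("gpio", 3, None, "28"),
    ("gpio", 4, None, "29"), ("gpio", 5, None, "30"),
    ("gpio", 6, None, "33"), ("gpio", 7, None, "34"),
    ("i2c0", 0, "sda", "40"), ("i2c0", 0, "scl", "35"),
    ("uart0", 0, "rx", "41"), ("uart0", 0, "tx", "42"),
]

def find_pin_conflicts(entries):
    """Return {pin: [users...]} for every pin that is claimed more than once."""
    users = defaultdict(list)
    for name, index, signal, pin in entries:
        label = f"{name}[{index}]" + (f".{signal}" if signal else "")
        users[pin].append(label)
    return {pin: labels for pin, labels in users.items() if len(labels) > 1}

print(find_pin_conflicts(EXTRA_IOS))   # an empty dict means no pin is used twice
```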

Great! But there is still a small problem: editing all this inside the litex-boards repository is not really the right way to do it.

It's time to create a separate directory for all this, or better yet, use Docker.

Containerization

While discussing how to run this on a MacBook (the Gowin IDE is not available for macOS), a friend created a small Docker container for the job: just specify the location of the license file and you're good to go! The author made a few small changes, mainly to set the working directory and add vim. So take a look at this repository and give it a try!

This will allow you to reliably run LiteX with the Gowin tools on any computer, regardless of operating system and distribution.

One problem has been solved; now things need to be put in order. The author settled on a simple directory structure.

The “platform” directory contains the platform file, and “software” holds the C source code for the program that runs on the SoC, which can be found on my GitHub.

When starting a container, I mount volumes in Docker like this:

docker run --rm \
    --platform linux/amd64 \
    --mac-address xx:xx:xx:xx:xx:xx \
    -v "${HOME}/gowin_E_xxxxxxxxxx.lic:/data/license.lic" \
    -v "${HOME}/Documents/Git/LitexTang9KExperiments:/data/work" \
    -it gowin-docker:latest

The license file is bound to the MAC address of the network card, so make sure the MAC address you set in Docker matches the one specified in your license. A MAC address generator can be used to pick the address, after making sure it does not collide with any device on your network.
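
One possible way to generate such an address is sketched below. It produces a random locally administered, unicast MAC address, which by construction cannot clash with a vendor-assigned address; it still has to be unique on your own network. The function name random_local_mac is made up for this example, and whether a given licensing setup accepts such an address is not something this sketch can guarantee.

```python
import secrets

def random_local_mac():
    """Return a random locally administered, unicast MAC address."""
    octets = bytearray(secrets.token_bytes(6))
    # First octet: set bit 1 (locally administered) and clear bit 0 (unicast).
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in octets)

print(random_local_mac())
```

The printed value can be used in place of xx:xx:xx:xx:xx:xx in the docker run command, provided it is the same address the license was issued for.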

After starting the container, I immediately find myself in the desired folder,

./sipeed_tang_nano_9k.py --build

one step away from a build.

Trouble with software

To begin, the author looked at the demo application in LiteX and compiled it. It can be flashed by integrating it into the SoC's internal ROM, but this means rebuilding the entire SoC every time the code changes, which is quite inconvenient when you want to iterate quickly.

Luckily, LiteX has a great program called litex_term that can be used to load binaries and connect a terminal to the SoC.

The standard LiteX BIOS supports loading and executing binary files, similar to the Arduino bootloader. It's quite simple to use:

litex_term /dev/TTYhere --kernel=yourapp.bin

To reload the binary after making changes, simply reboot the board!

There must be some RAM available on the SoC that is not used by the BIOS, since new code obviously cannot be loaded into the BIOS RAM area. My choice fell on the HyperRAM built into the FPGA. The upstream example also uses this approach and it seems to work quite well. The code to add this to the SoC is as follows:

        # HyperRAM ---------------------------------------------------------------------------------
        if not self.integrated_main_ram_size:
            # TODO: Use second 32Mbit PSRAM chip.
            dq      = platform.request("IO_psram_dq")
            rwds    = platform.request("IO_psram_rwds")
            reset_n = platform.request("O_psram_reset_n")
            cs_n    = platform.request("O_psram_cs_n")
            ck      = platform.request("O_psram_ck")
            ck_n    = platform.request("O_psram_ck_n")
            class HyperRAMPads:
                def __init__(self, n):
                    self.clk   = Signal()
                    self.rst_n = reset_n[n]
                    self.dq    = dq[8*n:8*(n+1)]
                    self.cs_n  = cs_n[n]
                    self.rwds  = rwds[n]
            # FIXME: Issue with upstream HyperRAM core, so the old one is checked in in the repo for now
            hyperram_pads = HyperRAMPads(0)
            self.comb += ck[0].eq(hyperram_pads.clk)
            self.comb += ck_n[0].eq(~hyperram_pads.clk)
            self.hyperram = HyperRAM(hyperram_pads)
            self.bus.add_slave("main_ram", slave=self.hyperram.bus, region=SoCRegion(origin=self.mem_map["main_ram"], size=4*mB))
            
        self.add_constant("CONFIG_MAIN_RAM_INIT") # This disables the memory test on the hyperram and saves some boottime

One less problem, but I would like to be able to compile my own code outside the LiteX repository while still using its ready-made drivers and so on. After some experimentation I came up with the following makefile.

All the magic lies in the build and header directories at the top:

BUILD_DIR=../../build/sipeed_tang_nano_9k
SOC_DIR=/usr/local/share/litex/litex/litex/litex/soc/
include $(BUILD_DIR)/software/include/generated/variables.mak
include $(SOC_DIR)/software/common.mak

The code is largely based on the demo application: the author first simplified it and then extended it with new peripherals.

Peripheral drivers

After building an SoC with a set of inputs/outputs, I2C and other things, the desire arises to actually use the peripherals! Most of them are pretty easy to use, but there isn't really any documentation on how to do it. The best way is to look at the Migen code and let LiteX generate a file with all the registers. This can be done by adding the “--soc-csv” option. For example:

./sipeed_tang_nano_9k.py --build --soc-csv=soc.csv

will generate a soc.csv file with all the registers inside. The --soc-json and --soc-svd options are also available to generate files in JSON and SVD format, respectively.
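
Such a file is easy to post-process. The sketch below lists the registers from a CSV export, assuming each register row has the layout kind,name,address,size,mode with kind equal to csr_register; check this against your own generated file before relying on it. The sample text, including all names and addresses in it, is invented for illustration, and read_registers is a made-up helper name.

```python
import csv
import io

# Hypothetical excerpt; names and addresses are invented for illustration.
SAMPLE = """\
csr_base,leds,0xf0002000,,
csr_register,leds_out,0xf0002000,1,rw
csr_register,gpio_in,0xf0001800,1,ro
constant,config_clock_frequency,27000000,,
"""

def read_registers(text):
    """Return {name: (address, size, mode)} for every csr_register row."""
    registers = {}
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines, comment lines and rows of any other kind.
        if len(row) < 5 or row[0] != "csr_register":
            continue
        _, name, address, size, mode = row[:5]
        registers[name] = (int(address, 16), int(size), mode)
    return registers

for name, (address, size, mode) in read_registers(SAMPLE).items():
    print(f"{name:12s} 0x{address:08x} size={size} {mode}")
```

To run it on a real export, replace SAMPLE with the contents of the generated file, for example open("soc.csv").read().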

Some C header files are also generated during the build. In particular, the csr.h file located in the build/sipeed_tang_nano_9k/software/include/generated/ directory is very useful. For small peripheral devices, using the functions from this file is quite feasible.

For example, the “gpio_in_read” function to read the GPIO state works as expected.

LiteX also ships drivers for some peripherals. For example, there is a great I2C driver that can handle more than one I2C bus, awesome!

Time to figure out how to use interrupts.

Trouble with interrupts

Enabling interrupts on the LiteX/FPGA side is quite simple: the irq.add function takes care of everything! For example:

        self.gpio = GPIOIn(platform.request("user_btn", 1), with_irq=True)
        self.timer1 = Timer()
        self.timer2 = Timer()
        
        # And add the interrupts!
        self.irq.add("gpio", use_loc_if_exists=True)
        self.irq.add("timer1",  use_loc_if_exists=True)
        self.irq.add("timer2",  use_loc_if_exists=True)

But how do we use them in software? After looking through the existing code, an interrupt handler was found. But there is a tiny problem:

void isr(void)
{
    __attribute__((unused)) unsigned int irqs;
    irqs = irq_pending() & irq_getmask();
    if(irqs & (1 << UART_INTERRUPT))
        uart_isr();
}

The author removed some #defines for clarity, but the point stands: this code only handles the interrupt of the standard UART! So you either need to change this file in LiteX or not use the LiteX libraries. Or make a small change:

// Weak function that can be overridden in your own software for any IRQ that is not the UART.
// Return true (not zero) if an IRQ was handled, or 0 if not.
unsigned int __attribute__((weak)) isr_handler(int irqs);

// Override by default with return 0
unsigned int isr_handler(int irqs)
{
    return 0;
}

...

void isr(void)
{
    __attribute__((unused)) unsigned int irqs;
    irqs = irq_pending() & irq_getmask();
    if(irqs & (1 << UART_INTERRUPT))
        uart_isr();
    else
        if(!isr_handler(irqs))
            printf("Unhandled irq!\n");
}

So it is a simple function with the weak attribute, defined above. This means that if a function with the same name is defined somewhere else, that definition overrides the weak one. If there is none, the weak function is called.

In other words, if an interrupt occurs that is not a UART interrupt, isr_handler() is called. If you implement it in your code, great, your version is called and executed. Otherwise, no harm done, the default from this file is called.

In your own main.c you can simply do the following:

unsigned int isr_handler(int irqs)
{    
    unsigned int irqHandled = 0;
    if(irqs & (1 << GPIO_INTERRUPT))
    {
        GpioInClearPendingInterrupt();
        irqHandled = 1;
    }
    return irqHandled;
}

In this case, if a GPIO_INTERRUPT interrupt occurs, it is handled and 1 is returned; otherwise 0 is returned and the interrupt handler can issue a warning 🙂
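
The bit arithmetic in the modified isr() can be tried out on a PC. The sketch below is a small Python model of its branch structure, not firmware code: the interrupt numbers are hypothetical (the real ones come from the generated headers), and dispatch and app_handler are names invented for this example.

```python
# Hypothetical interrupt numbers; the real ones come from the generated headers.
UART_INTERRUPT = 0
GPIO_INTERRUPT = 2

def app_handler(irqs):
    """Application-side handler: claims the GPIO interrupt only."""
    return 1 if irqs & (1 << GPIO_INTERRUPT) else 0

def dispatch(pending, mask, handler):
    """Mirror the branches of the modified isr() and report which one ran."""
    irqs = pending & mask                  # only enabled interrupts count
    if irqs & (1 << UART_INTERRUPT):
        return "uart_isr"                  # UART wins; the handler is not called
    if handler(irqs):
        return "isr_handler"               # the application claimed the interrupt
    return "unhandled"                     # would print "Unhandled irq!"

print(dispatch(1 << GPIO_INTERRUPT, 0xFF, app_handler))
print(dispatch((1 << UART_INTERRUPT) | (1 << GPIO_INTERRUPT), 0xFF, app_handler))
```

The second call shows a property of the else branch: when the UART and GPIO interrupts are pending at the same time, only the UART is serviced in that pass. The GPIO interrupt would have to be picked up on a later entry into isr(), assuming it stays pending.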

As part of the demo, the author created a program that reads data from the serial port and can execute several commands to test I2C, GPIO, timer interrupts, and so on. The full code can be found in the GitHub repository.

Now there is only one thing left to try: building your own peripheral!

Build Your Own Peripheral

To learn how to create a peripheral, the author decided to implement a simple PWM module: something that generates a PWM signal at a given frequency and duty cycle. It has a counter inside, and the output switches depending on whether the counter is below or above a certain value, which sets the PWM duty cycle.

It must have several registers:

  1. An enable register to enable/disable the PWM peripheral

  2. A divider register to be able to create PWM signals at a lower frequency

  3. A max count register: the internal counter counts up to this value and then resets

  4. A duty cycle register: if the counter is below this value, the output is high, otherwise it is low.

This all looks quite doable, and although Migen works differently from Verilog or VHDL, it allows you to write compact code thanks to all the capabilities of LiteX.

Creating a register and connecting it to the processor is very simple:

from migen import *

from litex.soc.interconnect.csr import *
from litex.gen import *

class PwmModule(LiteXModule):
    def __init__(self, pad, clock_domain="sys"):
        self.divider = CSRStorage(size=16, reset=0, description="Clock divider")

A few lines and a simple peripheral device is ready! It's just one 16-bit register, but it's amazing! No need to worry about CPU buses or anything like that. CSRStorage is not the fastest method, but for a peripheral device such as PWM it is sufficient.

So let's quickly create this peripheral!

from migen import *

from litex.soc.interconnect.csr import *
from litex.gen import *

class PwmModule(LiteXModule):
    def __init__(self, pad, clock_domain="sys"):
        
        self.enable = CSRStorage(size=1, reset=0, description="Enable the PWM peripheral")
        self.divider = CSRStorage(size=16, reset=0, description="Clock divider")
        self.maxCount = CSRStorage(size=16, reset=0, description="Max count for the PWM counter")
        self.dutycycle = CSRStorage(size=16, reset=0, description="IO dutycycle value")
        
        divcounter = Signal(16, reset=0)
        pwmcounter = Signal(16, reset=0)
        
        sync = getattr(self.sync, clock_domain)
        
        sync += [
            If(self.enable.storage,
                divcounter.eq(divcounter + 1),
                If(divcounter >= self.divider.storage,
                    divcounter.eq(0),
                    pwmcounter.eq(pwmcounter + 1),
                    If(pwmcounter >= self.maxCount.storage,
                        pwmcounter.eq(0),
                    ),
                ),
            )
        ]
                    
        sync += pad.eq(self.enable.storage & (pwmcounter < self.dutycycle.storage))

A few more registers, plus two internal counters: one to divide the clock and one for the PWM period. A complete and functional peripheral in just over 30 lines, amazing!
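
To see what frequency and duty cycle a given register setting produces, the counter logic above can be modelled in plain Python. This is a software sketch of the behaviour as the author of this example reads the Migen code, with the peripheral assumed to be enabled; it is not generated from the hardware description, and the function names are made up.

```python
def simulate_pwm(divider, max_count, duty, cycles):
    """Step a software model of the two counters; return the pad value per clock."""
    divcounter = 0
    pwmcounter = 0
    pad_trace = []
    for _ in range(cycles):
        # Synchronous logic: every new value is computed from the old state.
        pad = 1 if pwmcounter < duty else 0
        next_div = (divcounter + 1) & 0xFFFF
        next_pwm = pwmcounter
        if divcounter >= divider:
            next_div = 0
            next_pwm = (pwmcounter + 1) & 0xFFFF
            if pwmcounter >= max_count:
                next_pwm = 0
        divcounter, pwmcounter = next_div, next_pwm
        pad_trace.append(pad)
    return pad_trace

def pwm_period_clocks(divider, max_count):
    """Clocks per PWM period: each of max_count + 1 counts lasts divider + 1 clocks."""
    return (divider + 1) * (max_count + 1)

period = pwm_period_clocks(10, 1000)
trace = simulate_pwm(divider=10, max_count=1000, duty=400, cycles=period)
print(period, sum(trace), 27e6 / period)
```

With a divider of 10, a max count of 1000 and a duty value of 400, the model gives a period of 11011 clocks with the output high for 4400 of them, so roughly 40 % duty cycle and, at an assumed 27 MHz system clock, a PWM frequency of about 2.45 kHz. Note the off-by-one in both counters: the divider and max count registers hold the last value reached, not the number of steps.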

And to use this peripheral device in the SoC, you only need one line:

self.pwm0 = PwmModule(platform.request("pwm0"))

On the software side, only a few registers need to be initialized:

    pwm0_divider_write(10);
    pwm0_maxCount_write(1000);
    pwm0_dutycycle_write(400);
    pwm0_enable_write(1);

The complete code for the SoC can be found in the GitHub repository.

Conclusion

It was fun! Going from scratch to an FPGA SoC with some custom peripherals is amazing. And all this with remarkably few lines of code. LiteX definitely made an impression!
