YouTube Shorts from the Terminal: How to Automate Video Creation with FFmpeg and Bash

In a previous article, I talked about my game's development process, and today I will touch on an equally important part: marketing.

To promote the game, I started publishing Shorts on YouTube, but it took a lot of time and resources. As an engineer, I try to automate routine tasks, so I created a solution that automatically cuts videos into 60-second fragments. More details under the cut.



Use the navigation to select the block you are interested in:

The problem with indie games
What happened before
Possibilities for automation
Tool selection and philosophy
Changing video format and blurred background
Video cutting
Adding text
Conclusion

The problem with indie games


It is not enough for an indie developer to simply create a game for it to become popular. Marketing is an integral part of the process. Of course, you can hope that the game will find its audience on its own. But let's be realistic: even the coolest project can go unnoticed if you don't talk about it.

There are a huge number of guides on marketing indie games. I not only studied them, but also tried them out in practice. Below I share the conclusions that can help novice game developers.

  • There should be a lot of content. It is necessary to show the game to as many people as possible so that they recognize it and add it to their wishlist.
  • Frequency of publication is more important than uniqueness. Of course, adapting content for each social network is cool, but it takes a lot of time and resources. It is more effective to publish several simple materials than one unique one.
  • Vertical videos get good reach. Moreover, you don't even need to record a voiceover: showing the gameplay is enough.

So I started cutting horizontal videos into multiple Shorts to get extra views.

A small life hack to increase your reach on YouTube: if you uncheck the "Publish to subscriptions feed and notify subscribers" box when posting a video, people already familiar with the game will not see it in their feed.


Window with video parameters before publishing.


What happened before

Right now I have two main YouTube channels: a gaming one and a personal one. On the first, I publish horizontal and vertical videos to promote my game; some of them also go to GameJolt, Reddit, and other sites. On the second, only horizontal ones, some of which I devote to game development.

For editing I use CapCut. For my personal channel I edit the video and add sound; for the gaming channel I also add a generated voiceover. This process is quite costly, which is why publishing consistency often suffers. Adding fuel to the fire is the inconvenience of voice generation: a 300-character limit means the voiceover has to be generated in several passes, which greatly slows down the preparation of new content.


Possibilities for automation


The decision to automate the process didn't take long to mature, but what exactly can be automated? In this article I will start small and take on only the video processing itself; the rest of the pipeline I will leave for future materials.

So I need a "black box" for horizontal videos that will:

  • convert them into the vertical 9:16 format with a blurred background;
  • cut them into fragments no longer than 60 seconds, guided by timecodes;
  • overlay a title and part number on each fragment.
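
Looking ahead: everything below boils down to a single script that takes the source video and a timecode file as arguments. A hypothetical invocation (the script name is made up):

./make_shorts.sh gameplay.mp4 timecodes.txt
# produces gameplay_1.mp4, gameplay_2.mp4, ... ready for Shorts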

Tool selection and philosophy

I don't see any point in bothering with a UI, so I'll use bash scripts. For video processing, I choose FFmpeg: a quick look showed that it would be sufficient. Installing on macOS with Homebrew:

brew install ffmpeg

Packages for other platforms can be found on the official FFmpeg download page: https://ffmpeg.org/download.html.

To keep the project within the UNIX philosophy, I found a tool for downloading YouTube videos from the terminal: youtube-dl. I used it to download my first published video.

The utility is easy to use. First I request available formats for downloading videos:

youtube-dl -F "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

Afterwards I get the list:

[info] Available formats for dQw4w9WgXcQ:
format code  extension  resolution note
249          webm       audio only tiny   46k , webm_dash container, opus @ 46k (48000Hz), 1.18MiB
250          webm       audio only tiny   61k , webm_dash container, opus @ 61k (48000Hz), 1.55MiB
140          m4a        audio only tiny  129k , m4a_dash container, mp4a.40.2@129k (44100Hz), 3.27MiB
251          webm       audio only tiny  129k , webm_dash container, opus @129k (48000Hz), 3.28MiB
...
398          mp4        1280x720   720p  657k , mp4_dash container, av01.0.05M.08@ 657k, 25fps, video only, 16.62MiB
399          mp4        1920x1080  1080p 1180k , mp4_dash container, av01.0.08M.08@1180k, 25fps, video only, 29.83MiB
248          webm       1920x1080  1080p 1556k , webm_dash container, vp9@1556k, 25fps, video only, 39.34MiB
137          mp4        1920x1080  1080p 3024k , mp4_dash container, avc1.640028@3024k, 25fps, video only, 76.45MiB
18           mp4        640x360    360p  343k , avc1.42001E, 25fps, mp4a.40.2 (44100Hz) (best)

The (best) mark does not indicate the highest quality; it marks a format where video and audio are already muxed together.
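
If that pre-muxed quality is enough for you, youtube-dl can fetch it directly with the built-in best selector:

youtube-dl -f best "https://www.youtube.com/watch?v=dQw4w9WgXcQ"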

In my case, though, I want the maximum quality, so I select the mp4 video (137) and the webm/opus audio (251). To avoid downloading and merging them manually, I combine the two format codes with a plus sign:

youtube-dl -f 137+251 "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

I'm not sure if it's available for all videos – you'll have to check.

We've sorted out the preparations! Time to create a script.


Changing video format and blurred background


There is no need to create a separate script for this task. The solution will look like this:

ffmpeg -i video.mp4 -filter_complex \
"[0:v]scale=-1:1920,crop=1080:1920,gblur=sigma=20[bg]; \
[0:v]scale=1080:-1[ov]; \
[0:a]volume=1.0[aud]; \
[bg][ov]overlay=(W-w)/2:(H-h)/2[mix]" \
-map "[mix]" -map "[aud]" -r 60 result.mp4 -y

The command runs FFmpeg with video.mp4 as input and applies a filter_complex chain to its streams. Below I explain the filters I used.

The second line takes the video stream of input 0 (0:v) and scales it so that the height becomes 1920 px while the width changes proportionally (scale=-1:1920), crops it to the vertical 1080×1920 format (crop=1080:1920), and blurs it (gblur=sigma=20). The result goes into a new stream, bg.

The third line takes 0:v again, scales it to a width of 1080 px with proportional height (scale=1080:-1), and writes the result to a new stream, ov.

I pass the audio through a filter in the same way so that it doesn't get lost. There may be a much simpler way, but I couldn't send the audio stream to the map directly. Correct me in the comments if you know another way.

In the fourth line I combine the streams bg and ov, placing the video in the center of the screen with overlay=(W-w)/2:(H-h)/2, and write the result to mix. Then I send the streams to the output file with -map "[mix]" -map "[aud]", set 60 FPS with -r 60, and save the result in result.mp4. The -y flag overwrites the file if it already exists.
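
To avoid retyping the command, it can be wrapped in a tiny script. A minimal sketch, assuming it is saved as to_vertical.sh and made executable:

#!/bin/bash
# to_vertical.sh: $1 is the source video, $2 is the output name (default result.mp4)
ffmpeg -i "$1" -filter_complex \
"[0:v]scale=-1:1920,crop=1080:1920,gblur=sigma=20[bg]; \
[0:v]scale=1080:-1[ov]; \
[0:a]volume=1.0[aud]; \
[bg][ov]overlay=(W-w)/2:(H-h)/2[mix]" \
-map "[mix]" -map "[aud]" -r 60 "${2:-result.mp4}" -y

Then a conversion is just ./to_vertical.sh video.mp4 short.mp4.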

Result.

A start! We were able to save a minute of work at the cost of just a few hours of research. Isn't this wonderful?

Video cutting

If a video is longer than 60 seconds, YouTube will not accept it into the Shorts section. To fix this, I need to modify the previous script.

Moreover, I want to cut the video not just into 60-second pieces, but into logical parts. The description below the video already contains timecodes, which means I can simply copy that text and pass it to the script, and it will trim the video automatically. Let's get started!

For timecodes on YouTube I use the following format:

00:00 - Text1
01:40 - Text2

They can easily be converted to an array of seconds:

time_codes=()
while read -r line; do
	# the first two characters are minutes, the two after the colon are seconds
	minutes=${line:0:2}
	seconds=${line:3:2}
	# printf strips leading zeros so that "08" is not parsed as octal
	minutes="$(printf "%.0f" "$minutes")"
	seconds="$(printf "%.0f" "$seconds")"
	time_codes+=("$(($minutes*60+$seconds))")
done < "$2"
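
A quick sanity check is to print the resulting array; for the two timecodes above it should contain 0 and 100:

printf '%s\n' "${time_codes[@]}"
# 0
# 100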

If the distance between timecodes is more than 60 seconds, they need to be edited. Of course, you can automate this division, but this way I risk cutting off the video at illogical points.
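
I leave the re-splitting to a human, but a small check can at least warn about overly long fragments. A sketch that assumes the time_codes array is already filled:

for ((i = 1; i < ${#time_codes[@]}; i++)); do
  gap=$((time_codes[i] - time_codes[i-1]))
  if ((gap > 60)); then
	echo "Warning: fragment $i is ${gap}s long, add an extra timecode" >&2
  fi
done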

Next, I compute the length of each fragment; for the last one, I get the total duration of the video with ffprobe:

lengths=()
prev=
# the length of each fragment is the difference between neighboring timecodes
for key in "${!time_codes[@]}"; do
  if [[ $key == 0 ]]; then
	prev=${time_codes[$key]}
  else
	current=${time_codes[$key]}
	length=$((current-prev))
	lengths+=("$length")
	prev=$current
  fi
done

# total duration of the video, to calculate the length of the last fragment
vid_len=$(ffprobe -v error -select_streams v:0 -show_entries stream=duration -of default=noprint_wrappers=1:nokey=1 "$1")
vid_len=${vid_len%.*}
length=$((vid_len-prev))
lengths+=("$length")
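
ffprobe prints the duration with a fractional part, so the ${vid_len%.*} expansion simply trims everything after the dot. For example (the value is made up):

vid_len="125.462000"
echo "${vid_len%.*}"   # prints 125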

Now on to rendering the clips themselves! To do this, I wrap the modified filter_complex command in a loop:

for i in "${!lengths[@]}"; do
  # output name: source name plus part number, e.g. video_1.mp4
  target="${1%.*}_$(($i+1)).mp4"

  ffmpeg -ss ${time_codes[$i]} -i "$1" -filter_complex \
"[0:v]scale=-1:1920,crop=1080:1920,gblur=sigma=20[bg]; \
[0:v]scale=1080:-1[ov]; \
[0:a]volume=1.0[aud]; \
[bg][ov]overlay=(W-w)/2:(H-h)/2[mix]" \
-map "[mix]" -map "[aud]" -t ${lengths[$i]} -r 60 "$target" -y

done

Here I added the -ss ${time_codes[$i]} flag to start rendering from a given second, as well as -t ${lengths[$i]}, which sets the duration of the corresponding fragment.
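
One nuance worth knowing: when -ss comes before -i, FFmpeg seeks within the input file, which is fast; when re-encoding, modern builds also decode up to the exact requested second. If a cut ever lands slightly off on an older build, -ss can be moved after -i, so the input is decoded from the start up to the requested moment (slower, but unambiguous). A sketch, with the filter chain from above stored in a hypothetical $filters variable:

# slower, but decodes up to the exact second; the filter chain stays the same
ffmpeg -i "$1" -ss ${time_codes[$i]} -filter_complex "$filters" \
  -map "[mix]" -map "[aud]" -t ${lengths[$i]} -r 60 "$target" -y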

Great! The videos are now suitable for the Shorts section.

Adding text

The video is ready; all that remains is to add text. At the top I will display the video title and part number, and at the bottom, a call to watch the full version. Multi-line text cannot be centered as a block, so each line has to be drawn separately. For convenience, I put the text into the same file as the timecodes, split into two lines.
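
For clarity, here is what such a combined file might look like (the title lines are made up; the timecode lines are the same as before):

My Cool Game
Devlog, part 3
00:00 - Text1
01:40 - Text2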

Next, I modify the code that reads timecodes:

name1=""
name2=""
time_codes=()
while read -r line; do
  if [[ $name1 == "" ]]; then
	name1=$line
  elif [[ $name2 == "" ]]; then
	name2=$line
  else
	minutes=${line:0:2}
	seconds=${line:3:2}
	minutes="$(printf "%.0f" "$minutes")"
	seconds="$(printf "%.0f" "$seconds")"
	time_codes+=("$(($minutes*60+$seconds))")
  fi
done < $2

I create variables for text parameters:

text_size=80       # font size in pixels
margin_top=160     # offset of the first line from the top edge
margin_bottom=320  # offset of the bottom block from the bottom edge
line_spacing=100   # vertical distance between lines
font="$HOME/tools/Ubuntu/Ubuntu-Bold.ttf"
text_border=5      # width of the black border around the letters

Now let's modify the command to generate clips:

  ffmpeg -ss ${time_codes[$i]} -i "$1" -filter_complex \
"[0:v]scale=-1:1920,crop=1080:1920,gblur=sigma=20[bg]; \
[0:v]scale=1080:-1[ov]; \
[0:a]volume=1.0[aud]; \
[bg][ov]overlay=(W-w)/2:(H-h)/2,\
drawtext=text='$name1':fontfile=$font:fontcolor=white:fontsize=$text_size:x=w/2-text_w/2:y=$margin_top\
:bordercolor=black:borderw=$text_border,\
drawtext=text='$name2':fontfile=$font:fontcolor=white:fontsize=$text_size:x=w/2-text_w/2:y=$margin_top+$line_spacing\
:bordercolor=black:borderw=$text_border,\
drawtext=text='Часть $(($i+1))':fontfile=$font:fontcolor=white:fontsize=$text_size:x=w/2-text_w/2:y=$margin_top+$line_spacing*2+20\
:bordercolor=black:borderw=$text_border,\
drawtext=text='Полное видео':fontfile=$font:fontcolor=white:fontsize=$text_size:x=w/2-text_w/2:y=h-$margin_bottom-$line_spacing\
:bordercolor=black:borderw=$text_border,\
drawtext=text='на канале':fontfile=$font:fontcolor=white:fontsize=$text_size:x=w/2-text_w/2:y=h-$margin_bottom\
:bordercolor=black:borderw=$text_border[mix]" \
-map "[mix]" -map "[aud]" -t ${lengths[$i]} -r 60 "$target" -y

Despite the amount of code, everything is quite simple: five drawtext filters have been appended to the chain after the overlay. Each one sets the text, the font, the color, the size, the x and y positions, and the color and width of the border.

Result.

Conclusion


I ended up with a fairly convenient solution that has saved me time more than once. Some strings are hardcoded, but that can be fixed. I also plan to add language selection via a flag in the shell script; for Russian, everything already works out of the box.
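
The language flag could look something like this, a sketch using getopts; the English strings are my assumption:

# -l selects the overlay language, Russian by default
lang="ru"
while getopts "l:" opt; do
  case $opt in
	l) lang=$OPTARG ;;
  esac
done
shift $((OPTIND - 1))

if [[ $lang == "en" ]]; then
  part_label="Part"; footer1="Full video"; footer2="on the channel"
else
  part_label="Часть"; footer1="Полное видео"; footer2="на канале"
fi
# ...then use $part_label, $footer1, $footer2 in the drawtext filters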

Initially I thought about integrating the upload of finished videos into the pipeline, but YouTube treated them as spam and blocked them. If anyone has managed to automate this process properly, share your experience in the comments. I will be very grateful.
