Why learn R in 2023?


Article author: Dmitry Volodin

Analytics Engineer at TrafficStars

Hello everyone, I'm Dmitry Volodin, an Analytics Engineer at TrafficStars. Today I want to reflect a little on the demand for R and on whether it is worth learning.

This text reflects my personal experience and opinion. I will not do an analytical comparison of average salaries or vacancy counts across languages; I would rather share my thoughts, and I will try to stay as open-minded as possible.

R as a first language

R was the first programming language I used at work. At school there were QBasic and Pascal, and at university there was Fortran as well. I once tried to start learning Python, but somehow it did not stick. With R, though, it was love (or a match) at first sight.

R has a reputation as a difficult language, largely because it is vectorized. There are no scalar data structures in R: even a single number is a vector of length one. From the start you have to treat vectors and lists as whole objects and apply functions to them that return new objects. In other words, you have to keep in mind both the change to the collection as a whole and the change to each of its elements.
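To make the point about vectors concrete, here is a minimal sketch in base R. The variable names and numbers are made up for illustration.

```r
# A "single number" is already a numeric vector of length one.
x <- 5
is.vector(x)
length(x)

# Arithmetic applies to the whole vector at once,
# much like filling a formula down a spreadsheet column.
prices <- c(10, 20, 30)   # made-up values
doubled <- prices * 2     # a new vector; prices itself is not modified

# Functions are applied to a list as a whole and return a new object.
groups <- list(a = c(1, 2, 3), b = c(4, 5, 6))
totals <- sapply(groups, sum)  # named vector with one total per group
```

Nothing here loops over elements explicitly, which is the habit that takes getting used to if you come from an imperative language.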

But I consider these aspects of R an advantage for people who do not know programming and for whom Excel has been the only data analysis tool. Vectorization is built into spreadsheet formulas, as are some elements of the functional paradigm. If you are not familiar with programming but have done data analysis in Excel, I strongly recommend starting with R: it will be easier for you, because R is first and foremost about working with data and analyzing it.

Another important advantage of R, I think, is how easy it is to get first results. For an analyst who is just starting out, getting sensible results early is extremely important. Even in base R, without any tidyverse, you can produce an informative plot, aggregated data and statistical inference in just a couple of lines of code, with no boilerplate.
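As a rough illustration of "a couple of lines", the sketch below uses only base R and the built-in mtcars dataset. The choice of dataset and columns is mine, not something prescribed by the article.

```r
# Aggregated data: mean fuel economy for each cylinder count.
avg_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Statistical inference: Welch t-test of mpg between
# automatic (am = 0) and manual (am = 1) transmissions.
test <- t.test(mpg ~ am, data = mtcars)

# An informative plot: fuel economy against weight.
# The pdf(NULL) line only keeps non-interactive runs from writing a file;
# in an interactive session the plot() call alone is enough.
if (!interactive()) pdf(NULL)
plot(mpg ~ wt, data = mtcars, main = "Fuel economy vs weight")
```

Printing `avg_mpg` or `test` in the console shows the aggregated table and the full test summary.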

But what do you do later, when the basics are clear and you can already work and get paid? Keep learning. R has excellent statistical and machine learning capabilities. Visualization, interactive visualization especially, is in good shape too. Writing and delivering an analytical web application in Shiny is simple, and yet it can carry any functionality, from a simple parameterized chart to a management dashboard with interactive controls. MLflow and Spark also have R interfaces, so you can (and should) write analytical applications on a modern stack in R.

In addition, it is very rare these days for a developer to know only one language. R is not only a great first language but also a convenient bridge to other languages for working with data. Moving from R to SQL is very easy, since the approach to working with tables is very similar. And Python will not be a problem either if you dig a little deeper into the imperative paradigm and learn some algorithms.
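The similarity to SQL is easier to see side by side. The sketch below is my own example on the built-in mtcars dataset: the comment shows a query, and the base R lines underneath do roughly the same thing step by step.

```r
# SQL equivalent (illustrative):
#   SELECT cyl, AVG(mpg) AS mpg
#   FROM mtcars
#   WHERE hp > 100
#   GROUP BY cyl
#   ORDER BY cyl;

powerful <- subset(mtcars, hp > 100, select = c(cyl, mpg))   # WHERE + column list
by_cyl <- aggregate(mpg ~ cyl, data = powerful, FUN = mean)  # GROUP BY + AVG
by_cyl <- by_cyl[order(by_cyl$cyl), ]                        # ORDER BY
```

In both cases you describe what should happen to the whole table rather than iterating over rows.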

For me, R remains the ideal language for starting a career in data science. But you definitely should not limit yourself to it alone.

R as a second language

Here I will speculate more, because I have no personal experience. If you come from an academic environment and have experience with Fortran/Matlab/SPSS/Stata/APL, R can become a modern yet quite familiar data manipulation tool for you. I am wary of starting a flame war in the comments with these statements, so let me remind you that this is my opinion, based on limited theoretical knowledge of the tools listed.

If you are already an accomplished data scientist doing deep learning in Python, you should look at R as well: at the very least to broaden your horizons, or for more convenient ways to process and visualize data, or for its huge arsenal of statistical tools. Besides, the familiar Torch, TensorFlow with Keras, and H2O are available in R too.

I would not advise back-end and front-end developers who have decided to get into data analysis to start with R. First, your familiar language probably already has something for data analysis, for example data visualization libraries in JS. Second, Python will be easier, since it is a general-purpose language.

Labor market

This is the most painful topic for all R specialists. Yes, there are far fewer R vacancies than Python ones, even in data science. But there is a flip side: there are also far fewer analysts working in R than in Python, so the competition may be roughly the same. And the advantages from the first part of the article still apply.

In addition, companies, especially small ones, are fairly indifferent to which language you use to work with data. They are just glad to have broken out of the vicious circle of 1C + Excel.

Once you master R and choose a corporate career path, you will sooner or later run into SQL. And often SQL alone is enough to get a good job. I received an offer for a lead data engineer position at a large Russian company after being tested only on SQL and a little theory. And that was for the Hadoop stack, which I was not familiar with, as I of course told them.

The hype around IT has not subsided yet, and many are eager to jump on this train. My next piece of advice is not for applicants, students and university graduates, but for those who are deciding to change careers now. I recommend starting your journey with data analysis in R and becoming an analyst in your own industry, maybe even at your current company. That way you will have two huge advantages: knowledge of modern, powerful data analysis tools and broad experience in the industry (what is called domain-specific knowledge abroad and is considered a major advantage for analysts). Then gain experience and new knowledge, and your career will grow. Strictly speaking, this rule applies to any profession. You need to keep learning right up until retirement, and you should not stop then either.

In my R course, I give exactly the minimum needed to start in the profession. After completing it, you will be able to obtain, process and visualize data, conduct statistical analysis and present the results to customers.
