Cutting logs at ALROSA

Surely you too have wondered: does the shoemaker really have to go without boots? In our QCD and BI team we decided that this is not only unnecessary, but simply wrong. In this article I will tell you how we implemented a system for monitoring the demand for dashboards: a visualization of the usefulness of visualizations, so to speak. Below the cut is a detailed walkthrough with code examples, so anyone can repeat this on their own infrastructure, provided you have also chosen a flexible BI platform for your tasks.

Let's get acquainted! My name is Nikita Chistyakov, and I am a QCD and BI specialist at ALROSA. We have been successfully using the Visiology platform as our corporate BI tool for a long time; over the past five years we have managed to implement a lot of very interesting things with it. But back to our task. In the ALROSA ecosystem, BI is used to solve a whole range of problems: maintenance and repair tasks, corporate reporting, and monitoring of our own IT indicators. Every year there are more people who want to take advantage of BI, which means the number of dashboards is growing and the data model is becoming more complex.

It seems you could simply rejoice at the demand and live in peace. But that is not our way! Considering that today we have more than 120 dashboards, a meticulous specialist starts asking questions: "Do users really need all of them?", "Are there dashboards that are not used at all? Is that because they are inconvenient, or because different data is needed?" and so on.

A study of the capabilities of the Visiology API showed that the platform easily provides all the information needed to analyze the usefulness of dashboards. So we decided to build our own small module for monitoring dashboard access.

Instructions for assembling the monitoring

To describe the process briefly, it looks like this:

  1. We write a Python script that collects information from the instance (we use the requests, json, pandas and sqlalchemy libraries)

  2. We upload the collected data to the DBMS (MSSQL in our case, although any DBMS will do); a minimal upload sketch is shown right after this list

  3. We create a query to our DBMS on the Visiology platform side

  4. In parallel, we request user data from Active Directory (AD)
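
Item 2 of the list, uploading the collected data to the DBMS, does not appear in the steps below, so here is a minimal sketch using sqlalchemy and pandas.to_sql. The connection string, credentials, ODBC driver and the helper function are placeholders and assumptions; only the staging table names match the ones referenced in the Step 7 query.

import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string for MSSQL over pyodbc: substitute your own server,
# database, service account and ODBC driver version.
engine = create_engine(
    'mssql+pyodbc://bi_loader:***@dbserver.example.com/Visiology'
    '?driver=ODBC+Driver+17+for+SQL+Server'
)

# Hypothetical helper: replace a staging table with a freshly collected dataframe.
def upload_dataframe(df: pd.DataFrame, table_name: str) -> None:
    df.to_sql(table_name, engine, schema='dbo', if_exists='replace', index=False)

# For example, the visits dataframe assembled in Step 4 goes to the table that
# the Step 7 query reads from; the other dataframes are uploaded the same way.
upload_dataframe(dash_views_df, '_Python_Visiology_views_only')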

Now let's move directly to data collection. Each step is collapsed under a spoiler; expand it if you want to see how it is implemented in code.
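
All requests below rely on the instance base URL (main_url) and a bearer token stored in token. How exactly the token is issued depends on your Visiology configuration, so the snippet below is only a rough sketch: the token endpoint, service account and client credentials are placeholders, not the values used on our instance.

import requests

main_url = 'https://bi.example.com/'  # placeholder: base URL of your Visiology instance

# Assumption: the token is issued by the platform's identity service via the OAuth2
# password grant; the path, account and client credentials below are placeholders.
token_response = requests.post(
    main_url + 'idsrv/connect/token',
    data={
        'grant_type': 'password',
        'username': 'bi_monitoring_user',   # placeholder service account
        'password': '***',
        'scope': 'openid profile',          # placeholder scope
    },
    auth=('client_id_placeholder', 'client_secret_placeholder'),
)
token = token_response.json()  # the code below uses token['access_token']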

Step 1. Create a list of dashboards to study

First, we collect information on dashboards via the "admin/api/dashboards" API and build a dataframe with the dashboard IDs.

import datetime
import json
import re
import time

import numpy as np
import pandas as pd
import requests

# main_url - base URL of the instance (with a trailing slash); token - the auth response obtained earlier
query_headers = {
    "X-API-VERSION": "3.7",
    "Content-Type": 'application/json',
    "Authorization": "Bearer " + token['access_token']
}
dashboards_requests = requests.get(url=main_url + 'admin/api/dashboards', headers=query_headers).text
dashboards = json.loads(dashboards_requests)

dash_role_df_dict = {'Наименование дашборда': [], 'Id дашборда': [], 'Роль': []}
for one_dash in dashboards:
    dash_name = one_dash["Name"]
    dash_id = one_dash["_id"]
    if type(one_dash["Roles"]) != dict:
        if one_dash["Roles"]:
            for one_role in one_dash["Roles"]:
                dash_role_df_dict['Наименование дашборда'].append(dash_name)
                dash_role_df_dict['Id дашборда'].append(dash_id)
                dash_role_df_dict['Роль'].append(one_role['Name'])
        else:
            dash_role_df_dict['Наименование дашборда'].append(dash_name)
            dash_role_df_dict['Id дашборда'].append(dash_id)
            dash_role_df_dict['Роль'].append(None)
    else:
        # a single role comes back as a dict rather than a list
        dash_role_df_dict['Наименование дашборда'].append(dash_name)
        dash_role_df_dict['Id дашборда'].append(dash_id)
        dash_role_df_dict['Роль'].append(one_dash["Roles"]['Name'])

dash_role_df = pd.DataFrame(dash_role_df_dict)
Step 2. Get a list of BI platform user data

We collect information about Visiology users through the "admin/api/users" API:

query_headers = {
    'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
    "authorization": "Bearer " + token['access_token']
}
# payload - request body with the parameters expected by the users API (defined earlier in the script, not shown here)
visiology_users_requests = requests.post(url=main_url + 'admin/api/users', headers=query_headers, data=payload).text
visiology_users = json.loads(visiology_users_requests)

def get_field(user, key, default=None):
    """Return the user's field, substituting the default for missing or empty values."""
    value = user.get(key)
    return default if value in (None, '') else value

visiology_users_df_dict = {
    'Имя пользователя': [], 'Фамилия': [], 'Имя': [], 'Отчество': [], 'Email': [],
    'Системный пользователь': [], 'Дата регистрации': [], 'Дата последнего входа': []
}
for one_user in visiology_users['data']:
    user_name_login = get_field(one_user, 'UserName')
    user_name_name = get_field(one_user, 'GivenName')
    user_name_family = get_field(one_user, 'FamilyName')
    user_name_otchestvo = get_field(one_user, 'MiddleName')
    user_name_email = get_field(one_user, 'Email')
    user_name_tehnich_uchetka = get_field(one_user, 'IsInfrastructure', default=False)
    # dates arrive as ISO strings; keep only the date part and fall back to a sentinel when empty
    user_name_created = get_field(one_user, 'Created', default='1900-01-01')[:10]
    user_name_last_login = get_field(one_user, 'LastLogin', default='1900-01-01')[:10]

    visiology_users_df_dict['Имя пользователя'].append(user_name_login)
    visiology_users_df_dict['Фамилия'].append(user_name_family)
    visiology_users_df_dict['Имя'].append(user_name_name)
    visiology_users_df_dict['Отчество'].append(user_name_otchestvo)
    visiology_users_df_dict['Email'].append(user_name_email)
    visiology_users_df_dict['Системный пользователь'].append(user_name_tehnich_uchetka)
    visiology_users_df_dict['Дата регистрации'].append(datetime.datetime.fromisoformat(user_name_created))
    visiology_users_df_dict['Дата последнего входа'].append(datetime.datetime.fromisoformat(user_name_last_login))

visiology_users_df = pd.DataFrame(visiology_users_df_dict)
The output is a dataframe with the fields: [Имя пользователя], [Фамилия], [Имя], [Отчество], [Email]
Step 3. Collect IDs and match users

From the "/admin/api/userInfo" API we get the user ID and join it to the previous dataframe by user name.

visi_user_sub_df_dict = {'sub': [], 'Имя пользователя': []}
usernames = visiology_users_df['Имя пользователя'].unique()
for username in usernames:
    headers = {
        "Content-Type": 'application/json',
        "Authorization": "Bearer " + token['access_token']
    }
    payload = {
        "UserName": username
    }
    req = requests.post(main_url + 'admin/api/userInfo', headers=headers, data=json.dumps(payload))
    visi_user_sub_df_dict['sub'].append(json.loads(req.text)['_id'])
    visi_user_sub_df_dict['Имя пользователя'].append(username)
visi_user_sub_df = pd.DataFrame(visi_user_sub_df_dict)

visiology_users_df_to_output = pd.merge(visiology_users_df, visi_user_sub_df, how='left', on='Имя пользователя')

And in the end we get a dataframe:

[Имя пользователя], [Фамилия], [Имя], [Отчество], [Email], [sub]*

* sub is the internal user ID

Step 4. Count the number of dashboard visits

Using the dataframe from Step 1, we loop over it and query "loki/api/v1/query_range" to get the number of visits for each dashboard.

# Time in GMT
now = datetime.date.today()
now = datetime.datetime(now.year, now.month, now.day)
yesterday = now - datetime.timedelta(days=1)  # we look at yesterday's visits
date_start = int(yesterday.timestamp())
date_end = int((yesterday + datetime.timedelta(hours=24)).timestamp())

dashs_guids = dash_role_df['Id дашборда'].unique()  # dashboard IDs collected in Step 1
dash_views_df_dict = {"Id дашборда": [], "sub": [], "Время посещения": []}
for dash_guid_one in dashs_guids:
    if dash_guid_one not in (None, '0', np.nan, 'nan', 'None'):
        time.sleep(14)  # pause between requests so as not to overload Loki
        print(f'Дашборд - {dash_guid_one}')
        time_for_dash = time.time()
        params = {
            'query': '{stream="stdout"} |= "' + dash_guid_one + '" |= " Guid: " |= "ScriptSourceSettings" |= "UserSub"',
            'start': date_start * 1000000000,  # Loki expects nanosecond timestamps
            'end': date_end * 1000000000,
            'limit': 5000
        }
        try:
            resp = json.loads(requests.get(f'{main_url}loki/api/v1/query_range', params=params).text)

            if resp['data']['result']:
                print('Есть входы')
                length_of_vh = len(resp['data']['result'][0]['values'])
                print(f'Количество входов - {length_of_vh}')
                for vhod in range(length_of_vh):
                    log_if_exist = resp['data']['result'][0]['values'][vhod][1]
                    # positions of the UserSub, ActionId and timestamp markers in the log line
                    UserSub_indexes = [m.start() for m in re.finditer('\\", UserSub', log_if_exist)]
                    ActionId_indexes = [m.start() for m in re.finditer('\\", ActionId', log_if_exist)]
                    times_indexes = [m.start() for m in re.finditer(',"attrs":{"component":"dashboard-service"},"time":"', log_if_exist)]

                    for one_entity in range(len(UserSub_indexes)):
                        one_usersub = log_if_exist[UserSub_indexes[one_entity] + 14:ActionId_indexes[one_entity] - 1]
                        one_time = datetime.datetime.fromisoformat(log_if_exist[times_indexes[one_entity] + 51:times_indexes[one_entity] + 70])
                        dash_views_df_dict["Id дашборда"].append(dash_guid_one)
                        dash_views_df_dict["sub"].append(one_usersub)
                        dash_views_df_dict["Время посещения"].append(one_time)
        except (requests.RequestException, KeyError, ValueError) as err:
            print(f'Failed to process dashboard {dash_guid_one}: {err}')

dash_views_df = pd.DataFrame(dash_views_df_dict)

As a result, we get a dataframe with the following fields:

[“Id дашборда”], [“sub”], [“Время посещения”].

Step 5. Collect data from AD

We make a request to AD and also get a dataframe as the output:

[‘Email’], [‘Группа’]

We JOIN this dataframe by [‘Email’] with the previous one ([Имя пользователя], [Фамилия], [Имя], [Отчество], [Email], [sub]); a possible implementation sketch is shown below.
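
The article does not prescribe a specific AD client, so here is a minimal sketch using the ldap3 library. The domain controller address, bind account, search base and the use of the department attribute as the [‘Группа’] field are assumptions; adjust them to your own domain.

import pandas as pd
from ldap3 import Server, Connection, SUBTREE

# Placeholders: substitute your domain controller, service account and search base.
server = Server('ldaps://dc.example.com')
conn = Connection(server, user='EXAMPLE\\svc_bi_monitoring', password='***', auto_bind=True)

conn.search(
    search_base='OU=Employees,DC=example,DC=com',
    search_filter='(&(objectClass=user)(mail=*))',
    search_scope=SUBTREE,
    attributes=['mail', 'department'],  # assumption: 'department' plays the role of [Группа]
)

ad_df_dict = {'Email': [], 'Группа': []}
for entry in conn.entries:
    ad_df_dict['Email'].append(str(entry.mail).lower())
    ad_df_dict['Группа'].append(str(entry.department))
ad_df = pd.DataFrame(ad_df_dict)

# JOIN with the user dataframe from Step 3 on Email (email case may need normalizing on both sides)
users_with_ad_df = pd.merge(visiology_users_df_to_output, ad_df, how='left', on='Email')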

Step 6. Collect the 'section – dashboard' mapping

We query the "admin/api/webAppDashboards" API:

query_headers = {
    "X-API-VERSION": "3.7",
    "Content-Type": 'application/json',
    "Authorization": "Bearer " + token['access_token']
}
razdel_dash_requests = requests.get(url=main_url + 'admin/api/webAppDashboards', headers=query_headers).text
razdel_dash = json.loads(razdel_dash_requests)

razdel_dash_df_dict = {
    "Наименование раздела": [], "Ссылка на раздел": [],
    "Наименование дашборда": [], "Id дашборда": [], "Ссылка на дашборд": []
}
for one_razdel in razdel_dash:
    razdel_name = one_razdel['name']
    razdel_id = one_razdel['id']
    if type(one_razdel['dashboardsList']) != dict:
        # a section with several dashboards comes back as a list
        for one_dash in one_razdel['dashboardsList']:
            dash_name = one_dash['Name']
            dash_id = one_dash['_id']
            dash_url = main_url[:-1] + one_dash['url']
            razdel_dash_df_dict["Наименование раздела"].append(razdel_name)
            razdel_dash_df_dict["Ссылка на раздел"].append(main_url + 'dashboardPreviews?sectionId=' + razdel_id)
            razdel_dash_df_dict["Наименование дашборда"].append(dash_name)
            razdel_dash_df_dict["Id дашборда"].append(dash_id)
            razdel_dash_df_dict["Ссылка на дашборд"].append(dash_url)
    else:
        # a section with a single dashboard comes back as a dict
        dash_name = one_razdel['dashboardsList']['Name']
        dash_id = one_razdel['dashboardsList']['_id']
        dash_url = main_url[:-1] + one_razdel['dashboardsList']['url']
        razdel_dash_df_dict["Наименование раздела"].append(razdel_name)
        razdel_dash_df_dict["Ссылка на раздел"].append(main_url + 'dashboardPreviews?sectionId=' + razdel_id)
        razdel_dash_df_dict["Наименование дашборда"].append(dash_name)
        razdel_dash_df_dict["Id дашборда"].append(dash_id)
        razdel_dash_df_dict["Ссылка на дашборд"].append(dash_url)

razdel_dash_df = pd.DataFrame(razdel_dash_df_dict)

This forms a dataframe with the fields ["Наименование раздела", "Ссылка на раздел", "Наименование дашборда", "Id дашборда", "Ссылка на дашборд"].

Step 7. JOIN the arrays

On the Visiology platform side, all our arrays are JOINed on [‘sub’] and [‘Id дашборда’]:

select  
posesh."Id дашборда"
,part_2."Наименование дашборда"
,User2."Имя пользователя"
,User2."Фамилия" +' '+ User2."Имя" +' '+ User2."Отчество" as "ФИО"
,LOWER(User2."Email") as "Email"
,User2."Системный пользователь"
,User2."Дата регистрации"
,User2."Дата последнего входа"
,posesh."sub"
,User2."Группа"
,User2."Уволен"
,User2."Содержится в AD"
,1 as "1ka"
,part_1."Наименование раздела"
,CAST(posesh."Время посещения" as date) as "Время посещения по дням"

,part_1."i"
,posesh."Время посещения"
,part_1."Наименование раздела"+ '/' +part_2."Наименование дашборда" as "Раздел + Даш"
,CAST(DATEPART(week,posesh."Время посещения") as char(20)) + '/' + CAST(DATEPART(year,posesh."Время посещения") as char(20)) as "Неделя+Год"
,User2."Фамилия" +' '+ SUBSTRING (User2."Имя",1,1) +'.'+ SUBSTRING (User2."Отчество",1,1)+'.' as "ФИО короткое"
,CAST(User2."Площадка" as char) as "Площадка"
,LOWER(User2."Email") as "Email  OpenID Для JOIN БК"
,LOWER(User2."Email") as "Email  OpenID join vip"
,1 as "1ka Изм"

from [Visiology].[dbo].[_Python_Visiology_views_only] as posesh
LEFT JOIN [Visiology].[dbo].[_Python_Visiology_Users_Visiology_AD] as User2
on posesh."sub"=User2."sub"
LEFT JOIN (select "Id дашборда", "Наименование дашборда" from   [Visiology].[dbo].[_Python_visiologyParser_TEMP_MAIN_PART_2] 
group by 
"Id дашборда", 
"Наименование дашборда"
) as part_2
on posesh."Id дашборда"=part_2."Id дашборда"

LEFT JOIN [Visiology].[dbo].[_Python_visiologyParser_TEMP_MAIN_PART_1] as part_1
on posesh."Id дашборда"=part_1."Id дашборда"

Result

The result is a flexible data set that is easy to visualize in Visiology: you can see which dashboards are used and which are not, which employees visit the portal and look at the statistics they logically should be looking at and which do not, in which departments a data-driven management culture is already taking root, and who still needs help with onboarding and training.

I hope this article was helpful to you. If you have any questions about the implementation of the monitoring mechanism, ask them in the comments or private messages, I will definitely answer!
