BigQuery. What to do if you damaged or accidentally deleted tables
BigQuery is becoming an increasingly popular tool for working with data. With the growth of its prevalence, knowledge about it becomes more and more valuable. At the same time, in everyday practice, completely different situations can arise from which it is impossible to be 100% safe. In particular, it is possible to accidentally damage, fill in incorrectly, and maybe completely destroy the wrong partitions or the whole table. In this note, we will look at ways in which misunderstandings can be corrected.
If you messed up with the data, then the first thing to remember is:
If you have corrupted the table. BigQuery stores the state of your existing table at any point in time within the past 7 days. So you can roll back.
If you have deleted the table. You have 2 days to restore the destroyed table using a special command.
You can also refer to the official help on this issue, namely the section “BigQuery for data warehouse practitioners” >> “Backup and recovery” – link.
After realizing that everything is not so bad, you have 2 ways to restore your data: using an SQL query and using a special command in the console.
Situation 1: A table has been corrupted by an inappropriate operation
If you did not delete the table, but only damaged it by applying an unnecessary operation, then you can restore its state to any point in time during the last 7 days, as we wrote earlier. To do this, you just need to write a fairly simple SQL query.
An example for the case when you are sure that the table was “normal” 1 hour ago.
CREATE OR REPLACE TABLE
`имя_проекта.имя_датасета.имя_таблицы`
AS
SELECT
*
FROM
`имя_проекта.имя_датасета.имя_таблицы`
FOR SYSTEM_TIME AS OF
TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
Instead of `INTERVAL 1 HOUR`, of course, you can write any number of hours within 168, or, for example, `INTERVAL 1 DAY` , if your damage is noticed late. It is also possible not to restore the table immediately, but to look at the output result, of course without using the first part of the `CREATE OR REPLACE` query.
Situation 2. A partitioned table has been corrupted by an inappropriate operation
If you need to restore the state of a partitioned table, where a prerequisite is to specify a certain value of the column by which the table is partitioned, then you can deal with such a case. In this case, you need to write a query like this:
CREATE OR REPLACE TABLE
`имя_проекта.имя_датасета.имя_таблицы`
PARTITION BY
event_date
OPTIONS (require_partition_filter=TRUE) AS
SELECT
*
FROM
`имя_проекта.имя_датасета.имя_таблицы`
FOR SYSTEM_TIME AS OF
TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
WHERE
event_date <= CURRENT_DATE()
Situation 3: A table was accidentally deleted
Remember that you have 2 days to correct this situation. To restore a deleted table, you need to open the console by clicking ‘Activate Cloud Shell’ in the upper right corner, as shown in the picture below.
After that, a console will open at the bottom of the screen, where you can write commands to restore tables. When returning a remote table, you need to specify the moment when this table existed, but in microseconds. You also have a choice: either just restore the table or save it to some new one without restoring the old one. For example, you can practice saving it as a test before accurately restoring the old one in its original place.
With a simple restore, your command in the console will look something like this:
bq cp имя_датасета.имя_таблицы@1665946435000 имя_датасета.имя_таблицы
where the numbers 1665946435000 will represent your point in time according to the Unix Timestamp.
If you want to save the old restored table under a new name, then your command would look like:
bq cp имя_датасета.имя_таблицы@1665946435000 имя_датасета.имя_новой_таблицы
where the numbers 1665946435000 will represent your point in time according to the Unix Timestamp.
After entering this command, you will see a line where you need to press ‘y’, which means that you confirm the operation.
Here is an example of what it might look like in your console:
That’s it, your deleted table has been successfully restored.
To summarize, deleting or corrupting your table is not such a critical problem if you managed to keep within 7 days for the table that was damaged, and 2 days for the table that was mistakenly deleted. All these teams rescued us more than once. I hope that they will help you restore everything without loss.