I enabled the flag General_log for a Google Cloud SQL MySQL instance to get all the queries logs in Google Cloud Logging. I get all the queries users make, but also all the queries made to probe and maintain the instance every seconds. Those queries are made from localhost by Google Cloud SQL. Those logs from localhost are about 256 mb of data per day.
I am not sure about excluding all the logs from localhost using a regular expressions in a Log router sink (maybe a bad actor could get localhost access...), so I started to make a lot of regular expressions to precisely exclude most of the localhost logs. It is working great, localhost logs are now around 10mb per day only.
But I am afraid the queries from localhost might change in the futur. The table I want to get all the logs and monitor at all cost is the users table.
So I thought maybe I could exclude all the localhost logs except the ones containing the string "users". I saw on stackoverflow that one way to get data from a table without using the name could be to use views. So maybe I could exclude the queries logs from localhost that contains "users" and "view"...
Is there any other way a bad actor could get the table rows data without using "views" or "users" strings in queries?
I would prefer to not have to maintain all those regular expressions in the future...
Thank you!
For example, here are queries logs I have filtered out already using specific exclusion regular expressions:
2024-02-02T13:19:39.469512Z root[root] @ [127.0.0.1]55555 7777777777 Query SELECT @@version
2024-02-02T13:19:55.817843Z root[root] @ [127.0.0.1]55555 7777777777 Query INSERT INTO mysql.heartbeat(id, master_time) VALUE(1, UTC_TIMESTAMP(6)) ON DUPLICATE KEY UPDATE master_time=UTC_TIMESTAMP(6)
There are so many types of queries logs made by the system, tell me if you need more examples.
Now I am filtering them all using regular expressions like this:
textPayload=~("^........................... root[root] @ [127.0.0.1]..... 7777777777 Query SELECT @@version$")