BigData Engineer | Full stack dev | I write about ML/AI in Digital marketing. | linktr.ee/mshakhomirov | @MShakhomirov

Complete Data Studio guide and BigQuery tutorial for Firebase users, Machine Learning enthusiasts and Marketers. All you wanted to know. Data Studio template included.

Data Studio template. Image by author

Have you ever wondered how to reduce user churn and save money spent on user acquisition? This article is about how to count those users who stay in your App in order to understand what makes them stay.

Who is this article for?

  • Marketers who have been tasked with creating custom user retention dashboards.
  • Analysts who might want to create better reports.
  • AI and ML experts who will definitely want to predict user churn.
  • Firebase users who might want to question their retention numbers.

Prerequisites

If you like…


And start with data graphs for SQL transformation using Dataform

Dataform dependency graph. Image by author

Let’s say you are building your Data Warehouse solution using BigQuery, Snowflake, Redshift, etc. You have been tasked with creating a few reports, and you need production and test environments for the SQL transformations you run daily.

You would also want to run data transformations as code, pushing changes to a git repository. You are also looking for a nice and easy way to document everything, run SQL unit tests and receive notification alarms in case something goes wrong in your data warehouse.

An ideal way would be to create two separate data projects that are not overlapping and can be…


Complete guide using TensorFlow, Airflow scheduler and Docker

Photo by Setyaki Irham on Unsplash

Google AI Platform allows advanced model training using various environments. So it is really easy to train your model with just one command like so:

gcloud ai-platform jobs submit training ${JOB_NAME} \
--region $REGION \
--scale-tier=CUSTOM \
--job-dir ${BUCKET}/jobs/${JOB_NAME} \
--module-name trainer.task \
--package-path trainer \
--config trainer/config/config_train.json \
--master-machine-type complex_model_m_gpu \
--runtime-version 1.15

However, Google runtime environments are deprecated from time to time and you might want to use your own custom runtime environment. This tutorial explains how to set one up and train a TensorFlow recommendation model on AI Platform Training with a custom container.

My repository can…


Why I don’t trust User Activity Dashboards

User base scenarios. Image by author

A complete SQL guide for marketers and machine learning engineers. MAU, DAU and WAU. Firebase and BigQuery example. Nifty report template included. Read how to copy it at the end of this article. It’s free!

Counting active users in your App might be tricky. What would you count: active deviceIds or active accounts?

In order to effectively calculate your Active User numbers you will need a combination of both deviceId and userId.
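As a rough illustration of combining the two identifiers, the sketch below counts daily active users in plain Python. It is not the article's SQL: the event tuple layout and the function name `daily_active_users` are assumptions made for this example, and a device that has seen several accounts is simply attributed to the first one.

```python
from collections import defaultdict
from datetime import date


def daily_active_users(events):
    """Count distinct users per day from (day, device_id, user_id) tuples.

    user_id may be None for anonymous sessions. Hypothetical layout,
    chosen only for this sketch.
    """
    events = list(events)

    # Remember the first account ever seen on each device (a simplification).
    device_owner = {}
    for _, device_id, user_id in events:
        if user_id is not None:
            device_owner.setdefault(device_id, user_id)

    # Prefer the account; fall back to the device's known owner,
    # and only then to the device itself.
    active = defaultdict(set)
    for day, device_id, user_id in events:
        identity = user_id or device_owner.get(device_id) or f"device:{device_id}"
        active[day].add(identity)

    return {day: len(identities) for day, identities in active.items()}


events = [
    (date(2021, 1, 1), "d1", "u1"),  # u1 on a phone
    (date(2021, 1, 1), "d2", "u1"),  # same account on a tablet
    (date(2021, 1, 1), "d3", None),  # anonymous device
    (date(2021, 1, 2), "d1", None),  # signed out, but d1 is known to belong to u1
    (date(2021, 1, 2), "d3", None),
]
dau = daily_active_users(events)
```

Counting deviceIds alone would report three active users on the first day, while the combined identity reports two.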

Depending on your users you might face different scenarios where users may be using multiple devices, may have different accounts on one device and cross using…


Complete guide for scripting and UDF testing

Photo by Florian Olivo on Unsplash

Since Google BigQuery introduced Dynamic SQL it has become a lot easier to run repeating tasks with scripting jobs. Now we can do unit tests for datasets and UDFs in this popular data warehouse.

What is this article about?

This tutorial aims to answer the following questions:

  1. How to write unit tests for SQL and UDFs in BigQuery.
  2. How to link multiple queries and test execution.
  3. How to automate unit testing and data healthchecks.

All scripts and UDFs are free to use and can be downloaded from the repository.
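One possible way to express UDF test cases is as a generated script of BigQuery `ASSERT` statements. The Python sketch below only builds the script text; the UDF name `my_dataset.is_positive`, the helper `build_assert_script` and the test cases are hypothetical and do not come from the repository.

```python
def build_assert_script(udf_name, cases):
    """Build a BigQuery script with one ASSERT statement per test case.

    cases is a list of (sql_input, sql_expected) pairs, each already
    written as a SQL expression. Hypothetical helper for illustration.
    """
    statements = []
    for number, (sql_input, sql_expected) in enumerate(cases, start=1):
        statements.append(
            f"ASSERT {udf_name}({sql_input}) = {sql_expected} AS 'case {number}';"
        )
    return "\n".join(statements)


script = build_assert_script(
    "my_dataset.is_positive",  # hypothetical UDF
    [("1", "TRUE"), ("-1", "FALSE")],
)
print(script)
```

The resulting text could then be submitted as a scripting job, where a failing `ASSERT` would stop the script with the case label as its message.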


Getting Started

Complete Python comparison and Step by Step guide for any dataset. Kaggle User churn data.

Exploratory Data Analysis in Google Data Studio. Image by author.

Can we perform Exploratory Data Analysis with SQL?

— Yes, we can.

What is this article about?

It is about Exploratory Data Analysis (EDA) and aims to answer the following questions:

  • What is Exploratory Data Analysis (EDA)?
  • How to perform Exploratory Data Analysis (EDA) in Pandas (Python)?
  • How to perform Exploratory Data Analysis (EDA) in BigQuery SQL and how is it different from Pandas?
  • How to use dynamic SQL in BigQuery for Exploratory Data Analysis (EDA)?
  • How to create visualisations to explore your dataset in BigQuery / Pandas?
  • How to use Pandas/ BigQuery SQL to analyse relationships between variables for feature selection?
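To make the first EDA step concrete, here is a minimal standard-library sketch of a per-column summary (row count, missing values, range, mean and median). The column name `tenure`, the sample rows and the helper `summarise` are invented for illustration, and the function assumes at least one non-null value in the column.

```python
from statistics import mean, median


def summarise(rows, column):
    """Summarise one numeric column of a list of dict rows.

    None is treated as a missing value. Hypothetical helper for illustration.
    """
    values = [row.get(column) for row in rows if row.get(column) is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


rows = [{"tenure": 1}, {"tenure": 3}, {"tenure": None}, {"tenure": 8}]
summary = summarise(rows, "tenure")
```

The same figures can be obtained with aggregate functions in SQL or with a dataframe summary in Pandas; the point here is only what such a summary contains.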

Who is this…


Simple and effective dashboard. With actual report names, users and labels. Handy template included

BigQuery and Data Studio cost monitoring dashboard. (Image by author)

If you are a BigQuery user and you visualise your data with Data Studio then you might want to answer the following questions:

  • What was the cost of each Data Studio report for yesterday?
  • How many times was each query/report run, and who ran it?
  • What was the cost of queries for tables, datasets (e.g. production/staging/ analytics) with label X?
  • Can I be notified in case of a sudden surge in billed bytes amount?
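The last question can be approximated with a very simple rule: flag the latest day when its billed bytes exceed a multiple of the average of the preceding days. The sketch below assumes a plain list of daily totals; the function name `is_surge` and the default factor of 3 are arbitrary choices for this example, not part of the dashboard.

```python
from statistics import mean


def is_surge(daily_billed_bytes, factor=3.0):
    """Return True when the last value exceeds factor x the mean of earlier ones.

    daily_billed_bytes is ordered oldest to newest. Hypothetical helper.
    """
    *history, latest = daily_billed_bytes
    if not history:
        # Nothing to compare against, so do not alert.
        return False
    return latest > factor * mean(history)


quiet_week = [100, 120, 110, 130]
noisy_week = [100, 120, 110, 1000]
alerts = (is_surge(quiet_week), is_surge(noisy_week))
```

A real alert would read the daily totals from job metadata and send a notification instead of returning a boolean.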

The standard Google Billing dashboard won’t answer these questions.

According to the official Google docs, at the moment you can’t use labels for BigQuery jobs.

Read more here…


In 8 days. Quick learner’s guide for those who don’t have time to read the manuals. August 2020.

Google Cloud Certified Professional Data Engineer

Want to get this certification? Well, it is not an easy one. You’ll need to do the homework. From what I read online, people usually spend 2–3 months on preparation.

It’s not a secret that many of us won’t be using each of the Google products every day but we need to know them, right? This article is for those who don’t have time to read all the manuals. I will describe what I did to get ready for this exam in 8 days.

First of all I need to say that I didn’t have a clue how serious that…


How would you explain Firebase figures? Here are the answers. Neat template included as well as sample Firebase datasets for BigQuery analysis.

I am often confused by what I see in Firebase. What data is behind it?

Now I use Firebase Crashlytics and Performance data in Google Data Studio as it helps me to better understand my users.

Crashlytics dashboard

If you want, you can just copy the template. All sample datasets are included. Let me know if you need them in BigQuery too and I will share them as a public dataset.


… or 5 tricks to ease the pain and do things better

My first BigQuery bill :)

If you are in data analytics this article is for you.

Google Data Studio is f̶u̶l̶l̶ ̶o̶f̶ ̶b̶u̶g̶s̶ a free tool to build custom reports and visualise your data (which Google claims is beautiful). My data is not so beautiful. I would say building dashboards with it is a struggle. And this post is about how to build your reports quicker with no sudden surprises like the one you can see at the top.

Yup :) This was my first BigQuery invoice. …

💡Mike Shakhomirov
