Friday, August 26, 2022
Is Apache Airflow DAG Authoring Certification Worth Your Time? | by AnBento | Aug, 2022


An Honest Review Of Astronomer Certification.

Photo by Pixably on Pexels

A big part of my work as a data engineer consists of designing reliable, efficient and reproducible ETL jobs.

Over the last two years, Apache Airflow has been the main orchestrator I’ve used for authoring, scheduling and monitoring data pipelines.

For this reason, I recently decided to challenge myself by taking the Astronomer Certification for DAG Authoring, which is meant to assess knowledge of designing and creating data pipelines following best practices.

I passed the exam about a month after starting the preparation course offered by Astronomer, because I mainly studied during weekends and I really wanted to absorb what the course had to offer.

In this article, I would like to share my honest review of the course and the strategy I used to pass the exam, and to answer some questions, including:

  • Who is this certification for?
  • What does the exam consist of?
  • Is it worth your time?

Also, towards the end, I present 5 questions from the exam that I got wrong, and share why I made those mistakes and how you can avoid them.

Astronomer is the leading provider of cloud-based data orchestration platforms powered by Apache Airflow.

Their services include deploying and managing one or multiple Airflow instances in the cloud, allowing clients to focus on building, running and monitoring data pipelines instead of worrying about managing their environments.

The company currently offers two professional certifications:

  • Apache Airflow Fundamentals Certification | Level: Basic
  • Apache Airflow DAG Authoring | Level: Intermediate

Specifically, studying for the Apache Airflow DAG Authoring Certification prepares you to design and create reliable data pipelines in Python, following best practices.

The certification is aimed at all data professionals (including data engineers, BI engineers and data scientists) who regularly use Apache Airflow to do their job and wish to prove their knowledge.

Because the exam is meant to assess more advanced topics, Astronomer recommends at least 6 months of practical experience with Airflow.

They also mention that “If you have solid experience with creating DAGs, then you may be ready to apply your skills directly to the certification exam.”

However, I strongly recommend that you take advantage of the preparation course offered by Astronomer. This is because, despite having used Airflow for more than 2 years, I had never used a good part of the concepts taught in the preparation course, and those ended up appearing quite often in the exam.

The exam consists of 75 multiple choice questions and you are given 60 minutes to complete it. The passing score is 70%, quite generous as you only need 53 correct answers to pass.
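The 53-answer figure follows from rounding 70% of 75 up to the next whole question. A minimal sketch of that arithmetic, where the function name is purely illustrative:

```python
# Figures taken from the paragraph above.
TOTAL_QUESTIONS = 75
PASSING_PERCENT = 70


def min_correct_answers(total: int, percent: int) -> int:
    """Smallest whole number of correct answers that reaches `percent` of `total`."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-total * percent // 100)


needed = min_correct_answers(TOTAL_QUESTIONS, PASSING_PERCENT)
print(needed)  # 70% of 75 is 52.5, so 53 answers are needed
```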

However, please don’t underestimate the exam: in order to nail it, you will have to show that you master the different features that Airflow offers to create DAGs, the pros and cons of each one, as well as their limitations.

You should be confident when making design choices for data pipelines according to the specific use cases. You should have solid knowledge of the most common operators and be familiar with less common ones, particularly in the context of defining DAG dependencies, setting up different branches, waiting for events through sensors, etc.

What Strategy Did I Use?

I bought the exam and preparation course in a bundle for $150. Usually this gives you access to two exam attempts, meaning that if you fail once, you can retry for FREE.

Exam + prep course purchased as a bundle on the Astronomer website

Then, I watched all the videos in the preparation course once, without taking notes or spending too much time on them, and tried the exam straight after to get a sense of the type of questions and their difficulty. Funnily enough, I scored 50/75, meaning that I failed but was only 3 correct answers away from the passing threshold.

However, at this point I knew exactly the type of questions to expect and the topics I struggled with the most, so I watched all the videos a second (and in some cases a third) time. In this round, I took lots of notes and tried to replicate part of the code in my local Airflow environment.

Finally, one morning I decided it was time to re-attempt the exam: I managed to score 62/75, which means 12 more correct answers than on the first try, but still a bit below my expectations (given all the additional time investment!).

I have been an “A” student in the past, but that doesn’t necessarily pay the bills and is time consuming, so I’m very much satisfied with my almost 83% correct answer rate, as I can now switch my focus to something else.

Once I passed the exam, I received an official certificate that was shared as a digital credential via Credly. The badge looks like this:

The digital badge I received on Credly.

When it comes to assessing whether the time I invested in the certification was worth it, I’d honestly say that I’m somewhere in between: “YES and NO”.

Why YES?

I reckon the course was well structured and nicely delivered by the tutor. It was not the first course I took with Marc Lamberti, and I really like his positive attitude and his accent, so watching the videos was kind of entertaining.

Through the course I was exposed to a lot of topics and features that I had never used before in Airflow. This made me grow as a professional and will allow me to share this knowledge back at the workplace.

Although it is impossible to determine how good someone is with Airflow only through a certification, I’d say that investing time and resources to study for the exam shows employers that I’m committed to mastering Airflow and passionate about it. If anything, I’m one step closer to becoming an expert in the field.

Also, Astronomer should be considered a leader in the market when it comes to providing cloud-based Airflow services, so this was the best (if not the only) way to get Airflow certified.

Why NO?

However, the fact that they have no established competitors is also a drawback: this is because offering Airflow certifications is a secondary business for Astronomer, used to promote their main services and generate leads indirectly.

For example, I found it weird for the exam not to be proctored: if you and your colleague take the exam one after the other and she passes it before you attempt it, she will have access to the complete list of questions and answers, which means that you could use it to know the exact solutions beforehand. Without a proctored exam, people are allowed to sit exams without respecting the value of integrity, and I don’t like the idea.

On top of that, I have reasons to believe questions are not rotating much (if at all) in the exam. I had this impression because I attempted the exam twice and I can tell that, on both occasions, the questions were mostly the same. I’d suggest to Astronomer that having a larger pool of randomly rotating questions would make the certification even more respected.

As a last point, despite being extremely helpful, the idea of having two exam attempts induces students not to prepare thoroughly enough (at least for the first try, and I’m guilty of this too), because in case of failure the financial consequences and peer pressure will be minimal. I’d suggest Astronomer introduce some sort of penalty after the first attempt, such as a higher passing score.

Whether you’re thinking of taking an Airflow certification or have already been studying for a while, knowing the type of questions you will have to face on the day may help you identify topics that require revision.

In this section, I present 5 mock questions that are similar to those I found in the DAG Authoring exam, where I provided the wrong answer. I will share the correct answer instead and explain why I personally got confused at the time.

Question #1

Your DAG has:
- A start date set to the 1st of January 2022
- A schedule interval set to @daily
- An end date set to the 5th of January 2022

How many DAG Runs will you end up with?

Options:
- 3
- 4
- 5 → CORRECT ANSWER

Explanation

The correct answer is 5 DAG Runs in total, because the DAG will be triggered for the first time on the 2nd of January at midnight and so on until the 6th of January, according to the formula:

 triggered_date = start_date + schedule_interval

So remember that execution_date, start_date and triggered_date are three different concepts in Airflow, and to compute the number of DAG Runs you simply need to count the triggered dates. In the exam I got confused because for some reason I thought the interval was exclusive, meaning that the 5th of January was not included as a start_date. Of course this is not the case. Silly me!
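The count can be checked with plain Python dates. This is only a sketch of the formula above, not Airflow's scheduler: it assumes both the start and end dates count as execution dates and that each run is triggered one interval after its execution date. The helper name is made up for illustration.

```python
from datetime import date, timedelta


def daily_runs(start_date: date, end_date: date):
    """Return (execution_date, triggered_date) pairs for a daily schedule.

    Assumption: start_date and end_date are both inclusive as execution dates,
    and triggered_date = execution_date + schedule_interval.
    """
    interval = timedelta(days=1)
    runs = []
    execution_date = start_date
    while execution_date <= end_date:
        runs.append((execution_date, execution_date + interval))
        execution_date += interval
    return runs


runs = daily_runs(date(2022, 1, 1), date(2022, 1, 5))
for execution_date, triggered_date in runs:
    print(execution_date, "->", triggered_date)
print(len(runs), "DAG Runs")
```

Under these assumptions the first pair is triggered on the 2nd of January and the last on the 6th, which gives the five runs from the explanation.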

Question #2

What are some different ways of creating DAG dependencies? (Select all that apply)

Options:
- ExternalTaskSensor → CORRECT ANSWER
- BranchPythonOperator
- TriggerDagRunOperator → CORRECT ANSWER
- SubDAGs (even if you know that it's BAD) → CORRECT ANSWER

Explanation

This question has multiple correct options (3 to be precise), because ExternalTaskSensor, TriggerDagRunOperator and SubDAGs are all ways to create DAG dependencies, even though SubDAGs are not best practice.

Funnily enough, I got this right at the first attempt but wrong at the second, because I thought they were trying to trick me by adding the SubDAGs option, but that was actually also correct.

Question #3

Can you run this task twice for the same execution date (to backfill, for example)?

Options:
- YES
- NO → CORRECT ANSWER

Explanation

This question is a bit tricky if you don’t read the code carefully: the correct answer is NO because, as it is, the SQL code can only be run once. To make the task idempotent, so that it can be run multiple times, the code should be changed to:

CREATE TABLE IF NOT EXISTS planes(…)

I got this one wrong because I forgot to pay attention to the SQL code in the PostgresOperator, and I knew from experience that it is possible to backfill via the UI or CLI, so I naively answered YES.
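The difference between the two statements is easy to reproduce locally. The sketch below uses Python's built-in sqlite3 module as a stand-in for Postgres, and the column definitions are hypothetical because the exam's table definition is not shown here. It runs each statement twice, the way a re-run or backfill of the same task would.

```python
import sqlite3

# Hypothetical columns: the table definition from the exam question is not shown.
NON_IDEMPOTENT = "CREATE TABLE planes (id INTEGER PRIMARY KEY, model TEXT)"
IDEMPOTENT = "CREATE TABLE IF NOT EXISTS planes (id INTEGER PRIMARY KEY, model TEXT)"


def run_twice(statement: str) -> bool:
    """Run `statement` twice on a fresh in-memory database.

    Returns True if both executions succeed, False if the second one fails.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(statement)
        conn.execute(statement)  # simulates re-running the same task
        return True
    except sqlite3.OperationalError:
        # Raised by SQLite when the table already exists.
        return False
    finally:
        conn.close()


print(run_twice(NON_IDEMPOTENT))
print(run_twice(IDEMPOTENT))
```

Only the IF NOT EXISTS version survives the second execution, which is what makes the task safe to run again for the same execution date.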

Question #4

You want to process your data incrementally. Therefore you need to get the current execution date of your DAG Run. What's the best way to get it from the PythonOperator?

Options:
- A → CORRECT ANSWER
- B
- C

Explanation

Apparently the correct answer is A, however I don’t recall seeing anything like that in the preparation course. I actually chose B, because the **context variable can also be used to access the execution date variable ds.

I’d suggest Astronomer revise this question or create a dedicated video that goes more in depth on the topic. As it is now, it seems a bit confusing to me.
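For readers unfamiliar with the **context pattern mentioned above, here is a minimal plain-Python sketch of a callable that reads ds from its keyword arguments. It does not reproduce option A or B from the exam, which are not shown here. The function name and the hand-built dictionary are illustrative stand-ins for what Airflow would pass at runtime.

```python
def process_increment(**context):
    """Callable in the style described above: read `ds` from **context."""
    ds = context["ds"]  # execution date as a YYYY-MM-DD string
    return f"processing data for {ds}"


# Hand-built stand-in for the runtime context, for illustration only.
fake_context = {"ds": "2022-01-01"}
result = process_increment(**fake_context)
print(result)
```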

Question #5

With the PythonOperator, what's the most efficient way to push multiple XComs at once?

Options:
- A
- B → CORRECT ANSWER
- C

Explanation

The correct answer is B, because the most efficient way to push multiple XComs is indeed to specify the type of the value returned by the PythonOperator (which in this case is a dictionary) when defining the function. This method is clearly addressed in the video about XComs with the TaskFlow API. In the exam I wrongly chose A, forgetting that the value type also has to be clearly stated at the top.
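To make the idea concrete without an Airflow installation, here is a deliberately simplified toy model of it, not Airflow's implementation. A dictionary return annotation is used as the signal to store each key separately instead of storing one combined value. The names push_xcoms, extract, extract_untyped and the store dictionary are all made up for illustration.

```python
from typing import Any, Callable, Dict, get_origin, get_type_hints


def push_xcoms(func: Callable[[], Any], store: Dict[str, Any]) -> None:
    """Toy stand-in for an XCom push (not Airflow code).

    If `func` is annotated as returning a dict, every key is stored on its own;
    otherwise the whole result is stored under a single key.
    """
    result = func()
    return_hint = get_type_hints(func).get("return")
    if get_origin(return_hint) is dict and isinstance(result, dict):
        store.update(result)
    else:
        store["return_value"] = result


def extract() -> Dict[str, int]:
    # Return type stated explicitly, as in the correct answer.
    return {"rows": 42, "errors": 0}


def extract_untyped():
    # Same payload, but without the return annotation.
    return {"rows": 42, "errors": 0}


typed_store: Dict[str, Any] = {}
untyped_store: Dict[str, Any] = {}
push_xcoms(extract, typed_store)
push_xcoms(extract_untyped, untyped_store)
print(typed_store)
print(untyped_store)
```

In this toy model, the annotated function produces two separate entries while the unannotated one produces a single combined entry, which mirrors the distinction the question is testing.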

In this article I shared an honest review of the Apache Airflow DAG Authoring Certification offered by Astronomer.

While studying to pass the exam, I couldn’t find a lot of additional material and tips out there, so this is my way to give something back to the community and help those people who are interested in getting certified.

I hope the suggestions and mock questions I shared will help you nail the exam in no time, and please feel free to contact me if you need more help.


