
SOL-C01 Question #254: Real Exam Question with Answer & Explanation

The correct answer is C: Create a Snowflake User-Defined Function (UDF) that encapsulates the feature engineering logic,

Querying and Performance

Question

A data scientist is working on a machine learning project using Snowflake Notebooks. They have a dataset stored in Snowflake and need to perform feature engineering. They want to write a Python function that takes a Snowflake table name and a list of SQL expressions as input, executes these expressions against the table, and returns a Pandas DataFrame containing the new features. Which approach is MOST suitable for creating and executing this function within a Snowflake Notebook, minimizing data transfer outside of Snowflake?

Options

  • A. Use the `snowflake.connector` library to connect to Snowflake, execute each SQL expression
  • B. Use `snowflake.snowpark.functions.call_udf` to call a UDF from Snowflake Notebooks and
  • C. Create a Snowflake User-Defined Function (UDF) that encapsulates the feature engineering logic,
  • D. Use the Snowflake web UI to create a view containing the feature engineered data. Load that view
  • E. Utilize the `sqlalchemy` library to establish a connection to Snowflake, construct the SQL query

Explanation

Creating a UDF is the most efficient way to perform feature engineering because the computation runs inside Snowflake's compute engine. Snowpark or standard SQL can then call the UDF and retrieve its results, so only the final features leave Snowflake, which minimizes data transfer. Option A still fetches intermediate data to the client. Option D creates extra steps and extra objects. Option E isn't the standard approach when working with Snowpark in Notebooks.
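The idea can be illustrated with a minimal Snowpark sketch. It is not taken from the exam material: the feature function `safe_ratio`, the UDF name `SAFE_RATIO_UDF`, and the column names `SPEND` and `VISITS` are hypothetical, and `session` is assumed to be a `snowflake.snowpark.Session` such as the one returned by `get_active_session()` in a Snowflake Notebook.

```python
from typing import Optional


def safe_ratio(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Feature logic: a ratio that yields NULL instead of dividing by zero."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


def engineer_features(session, table_name: str):
    """Register safe_ratio as a UDF and apply it inside Snowflake.

    Assumes `session` is a snowflake.snowpark.Session and that the table
    has numeric columns SPEND and VISITS (hypothetical names).
    """
    # Imported lazily so the pure-Python logic above can be tried without Snowpark.
    from snowflake.snowpark.functions import call_udf, col
    from snowflake.snowpark.types import FloatType

    # Register a temporary UDF; the function body will run in Snowflake.
    session.udf.register(
        safe_ratio,
        return_type=FloatType(),
        input_types=[FloatType(), FloatType()],
        name="SAFE_RATIO_UDF",  # hypothetical UDF name
        replace=True,
    )

    # Build a lazy query that calls the UDF once per row.
    features = session.table(table_name).select(
        call_udf("SAFE_RATIO_UDF", col("SPEND"), col("VISITS")).alias("SPEND_PER_VISIT")
    )
    # to_pandas() triggers execution in the warehouse; only the result set is transferred.
    return features.to_pandas()
```

In a Notebook, this might be called as `engineer_features(get_active_session(), "MY_TABLE")`, where `MY_TABLE` is a placeholder. The feature logic is kept in a plain Python function so it can be checked locally before it is registered.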

Topics

#Snowflake Notebooks #User-Defined Functions (UDFs) #Snowpark #Data Transfer Optimization
