Lesson 3 - Abstraction, Functions, and Scope

Learning objectives: Students will be able to design, use, and write functions with consistent variable scoping and design choices.

Specific skills:

  • Writing pseudocode
  • Writing basic functions
  • Using functions
  • Using multiple arguments in a function and using them inside the scope of the function
  • Writing local and global variables and passing them between scopes
  • Adding default values to a function parameter
  • Writing docstrings
  • Using assert to test a function’s output

Introduction

Abstraction is a core concept in programming; it’s the idea of taking something complex and assigning it a simpler and more reusable form. You’ve already learned about a form of abstraction in lessons 1 and 2: variables. When you assign a string, number, list, etc. to a variable you’re separating the item (the data in the variable) from the accession (the variable name).

This same idea can be applied to processes; these are called functions. Functions allow you to execute a series of commands using a single execution call and can be written to process multiple inputs by being called multiple times.

Abstraction is a fundamentally difficult concept to grasp in programming, so don’t feel bad if it takes a bit to wrap your head around.

Motivating Example

Consider a common training task: building a calculator that can do the following with two numbers:

  • Add
  • Subtract
  • Multiply
  • Divide
  • Calculate an exponent (raise the first number to the power of the second)

We choose this because it is straightforward to understand, has a clear requirement for inputs/outputs, and is easily testable. You already learned how to do all of these mathematical operations, so the only novel concept will be related to building and running functions.

Pseudocode

When writing functions it’s important to think about three things:

  • What does my function need to do? Put another way, what output should it return?
  • What input does it require?
  • What steps need to happen to go from input to output?

Writing out these requirements and steps in natural language is an organized and efficient way to understand your function before writing it, which makes creating and testing it easier. This is called pseudocode: it is not code, but it explains the logic behind a section of code and is organized in the same way as the code itself.

Here’s how I would write these notes for our calculator example:

  • Input:
    • Mathematical operation
    • Number 1
    • Number 2
  • Output:
    • Resulting value from operation
  • Process (pseudocode):
    • Check for inputted operation
    • Calculate the operation using two inputted values
    • Return resulting value
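
One way to carry notes like these into code is to write them as comments inside an empty function and fill in the steps later. The sketch below is only a skeleton: the name calculator_skeleton is a placeholder chosen for this illustration, and the body deliberately does nothing yet.

```python
def calculator_skeleton(operation, num_1, num_2):
    # Input: a mathematical operation and two numbers

    # Step 1: check which operation was requested

    # Step 2: calculate the operation using the two input values

    # Step 3: return the resulting value

    # "pass" is a placeholder so the empty function is still valid Python
    pass
```

Because nothing is returned yet, calling this skeleton gives back None. The comments mark where each step of the real code will eventually go.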

First example

With that, let’s write our first, and simplest, form of a function: a function to add two numbers.

def add(num_1, num_2):

    result = num_1 + num_2
    
    return result


"""
#Note: you can also return without an intermediate variable

def add(num_1, num_2):
    return num_1 + num_2
"""

Notice how the function has three major components:

  • def add(num_1, num_2): defines the name of the function and its inputs. This line is the function declaration.
  • The commands to run whenever the function is called. We only have one in this case: result = num_1 + num_2.
  • return result tells the function that it has done everything it needs to and which value(s) to hand back to the code that called it, which here is the global scope (more on that later). Whatever is returned can then be assigned to a variable when you call the function, as you’ll see in the next block of code.
    • Note: You can return multiple values by separating them with commas, e.g. return foo, bar, baz

Also notice that if you try to run the code block above nothing happens. This is because all we’ve done so far is define the function, meaning we built the machinery, but we haven’t actually run it yet.

You’ve probably already run a function without realizing it, though: the humble print(). If you’ve printed anything, you’ve run a function. print() takes one or more values and writes them to the output (the terminal, or the cell output in a notebook). Calling any other function is usually just as simple: you type the name of the function followed by parentheses containing the inputs.

Let’s give this a go with our adding function in a couple different flavors, all of which are functionally identical:

#All are equivalent

#Positional args
add_result = add(2, 8)

#Keyword args
add_result = add(num_1=2, num_2=8)

#Positional args with variables
num_1 = 2
num_2 = 8
add_result = add(num_1, num_2)

#Keyword args with variables
num_1 = 2
num_2 = 8
add_result = add(num_1=num_1, num_2=num_2)

As you can see, there are a few different ways you can specify inputs for functions:

  1. Positional arguments: the inputs are assumed to be in the same order as in the function definition (the first argument for add() is always num_1, the second is always num_2).
  2. Keyword arguments: you explicitly state which parameter you are assigning when you call the function, so the order doesn’t matter.

Both methods are used in different contexts. Which one you pick generally comes down to how complex the set of inputs is and your preferred style. Just remember that positional arguments depend on the order used in the definition, so it’s easier to make mistakes with them.
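
To make the ordering risk concrete, here is a small sketch. The function make_email and its parameters are hypothetical, invented only for this illustration; the point is that swapping positional arguments raises no error, it just quietly gives the wrong answer.

```python
def make_email(user, domain):
    # Hypothetical helper used only for this illustration
    return f"{user}@{domain}"


# Positional: the order of the values decides which parameter each one fills
print(make_email("ada", "example.com"))

# Swapped by accident: no error is raised, but the result is wrong
print(make_email("example.com", "ada"))

# Keyword: the names decide, so any order gives the same result
print(make_email(domain="example.com", user="ada"))
```

The first and third calls produce the same address, while the second produces a reversed one. With addition you would never notice the swap, which is why order mistakes can hide for a long time.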

Scope

Now that we’re talking about input variables it’s important to consider scope. This is the concept that whatever happens in a function stays in the function unless it is returned. The opposite is not true: code inside a function can see variables defined outside of it.

Take the following demonstration as an example:

#Global scope: accessible to all
global_scope_tester = "foo"
print(f"Before function; outside of function scope: {global_scope_tester}")

def print_scope():
    print(f"Inside function; outside of function scope: {global_scope_tester}")

    local_scope_tester = "bar"
    print(f"Inside function; inside of function scope: {local_scope_tester}")

    return None

print_scope()
Before function; outside of function scope: foo
Inside function; outside of function scope: foo
Inside function; inside of function scope: bar

We can see that global variables are accessible to:

  • Code outside of functions
  • Code inside functions, without being passed in as input
    • Warning: relying on this is bad practice and should be avoided where possible; it makes code harder to test and to trace.
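
As a rough illustration of that warning, the sketch below compares a function that quietly reads a global with one that receives the same value as an argument. The names offset, add_offset_implicit, and add_offset_explicit are made up for this example.

```python
offset = 5  # global variable (hypothetical name for this example)


def add_offset_implicit(num):
    # Works, but silently depends on a global defined somewhere else
    return num + offset


def add_offset_explicit(num, offset_value):
    # Everything the function needs arrives through its parameters
    return num + offset_value


print(add_offset_implicit(2))
print(add_offset_explicit(2, offset))

# The explicit version can be tried with any value without touching the global
print(add_offset_explicit(2, 10))
```

Both versions give 7 with the current global, but only the explicit one can be checked with a different offset (12 in the last call) without editing code outside the function.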

You may notice that there is still a scenario missing: what if you try to print local_scope_tester outside of the function?

Try to run the following:

print(f"Inside function; outside of function scope: {local_scope_tester}")
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 print(f"Inside function; outside of function scope: {local_scope_tester}")

NameError: name 'local_scope_tester' is not defined

As you may have guessed, this code doesn’t run. In fact, if you check carefully, you can see that the error type is a NameError, meaning that no variable named local_scope_tester exists in the scope where the code is looking. That variable is local to the function, and unless you return its value you can’t access it outside of the function.

Here’s how to fix that:

#Global scope: accessible to all
global_scope_tester = "foo"
print(f"Defined before function; called outside of function scope: {global_scope_tester}")

def print_scope():
    print(f"Defined inside function; called outside of function scope: {global_scope_tester}")

    local_scope_tester = "bar"
    print(f"Defined inside function; called inside of function scope: {local_scope_tester}")

    return local_scope_tester

local_scope_tester = print_scope()
print(f"Defined inside function; called outside of function scope: {local_scope_tester}")
Defined before function; called outside of function scope: foo
Defined inside function; called outside of function scope: foo
Defined inside function; called inside of function scope: bar
Defined inside function; called outside of function scope: bar

This is the motivation for returning values and writing organized input/output for your functions: keeping scopes and I/O organized leads to readable, consistent code with fewer errors.
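
The same pattern extends to more than one value. As noted earlier, a function can return several values separated by commas; the sketch below shows one way to receive them. The function add_and_subtract is a hypothetical example, not part of the calculator.

```python
def add_and_subtract(num_1, num_2):
    # Both variables are local to the function
    total = num_1 + num_2
    difference = num_1 - num_2

    # Comma-separated return values are packed into a single tuple
    return total, difference


# Unpack the returned tuple into two global variables
total, difference = add_and_subtract(8, 2)
print(total, difference)

# Or keep the tuple as a single value
both = add_and_subtract(8, 2)
print(both)
```

The first print shows 10 6 and the second shows (10, 6): the two results travel out of the function together as one tuple.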

Default Values

Sometimes you may have a function that will rarely need a different input than a known value. In our addition function, let’s assume someone almost always wants to add 5 to a value. In situations like these it’s useful to add a default value in the function definition. This gives the developer a known outcome and makes it easier for users to run the code when they don’t want to specify an input whose value is (usually) obvious.

Here’s a re-definition of the addition machine with a default value of 5 for num_2:

def add(num_1, num_2=5):

    result = num_1 + num_2
    
    return result

add_res = add(num_1=2)
print(add_res)
7

Pretty simple, huh? To set a default value all you need to do is set the input parameter equal to the default value in the definition. Whenever the function is called it now only requires num_1, since the function already knows what num_2 is.

Of course, since addition is commutative (the order of the numbers doesn’t matter), this example doesn’t show which input the default applies to. Let’s write a subtraction function, where the order of inputs matters:

def subtract(num_1, num_2=5):

    result = num_1 - num_2
    
    return result

subtract_res = subtract(num_1=2)
print(subtract_res)

print(subtract(2, 3))
print(subtract(9))
print(subtract(num_1=43, num_2=12))
-3
-1
4
31

Hopefully you can see how default values are useful now, especially when a function is used many times. Default values are optional; use your best judgement to add them when they make sense for your function and the context in which it’s used.
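
As one more illustration of a default that "makes sense", consider raising a number to a power, where squaring is assumed here to be the most common case. The function power_of and its default of 2 are assumptions made for this sketch.

```python
def power_of(base, power=2):
    # Hypothetical helper: squares by default, assuming that is the common case
    return base ** power


print(power_of(4))             # uses the default power of 2
print(power_of(4, 3))          # overrides the default positionally
print(power_of(4, power=0.5))  # overrides the default by keyword
```

These calls print 16, 64, and 2.0. Note that in a definition, parameters with default values must come after the parameters without them, which is why power is listed second.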

Docstrings

You can imagine that if you’re writing a lot of functions that do a lot of different things, distinct names alone might not be enough to quickly find the one you need. Luckily there is a standardized solution to this: docstrings. These are short descriptive blocks placed directly below a function definition that clarify what the function does, what its inputs and outputs are, and any other relevant information (such as Stack Overflow citations!).

There are multiple docstring format standards, but I prefer the Google style. Editors such as VS Code can even generate docstring templates automatically (for example through extensions), which makes the process easier. Here’s the template for a Google style docstring:

def function_with_types_in_docstring(param1, param2):
    """Example function with types documented in the docstring.

    General notes here

    Args:
        param1 (int): The first parameter.
        param2 (str): The second parameter.

    Returns:
        bool: The return value. True for success, False otherwise.

    .. Links:
        https://www.python.org/dev/peps/pep-0484/

    .. TODO
        foo
        bar

    """

Note that the same kind of text block can be placed at the top of a script as well, so you can have consistently organized code from top to bottom.
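
Docstrings are more than comments: Python stores them on the function, so they can be read back later. A minimal sketch, reusing the add() function from this lesson:

```python
def add(num_1, num_2):
    """Add two numbers together."""
    return num_1 + num_2


# The docstring is stored on the function itself in the __doc__ attribute
print(add.__doc__)
```

This prints "Add two numbers together." The built-in help() function shows the same text together with the function signature, so calling help(add) in a notebook is a quick way to look up a function you wrote earlier.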

Let’s go ahead and define all the other functions needed for our calculator with some basic docstrings to exemplify the process:

def add(num_1, num_2):
    """Add two numbers together."""
    result = num_1 + num_2
    
    return result

def subtract(num_1, num_2):
    """
    Subtract num_2 from num_1.

    Args:
        num_1 (Union[int, float]): Number to subtract from.
        num_2 (Union[int, float]): Number to subtract with.

    Returns:
        Union[int, float]: Result of subtraction process.
    """
    result = num_1 - num_2
    
    return result

def multiply(num_1, num_2):
    """Multiply two values together, order doesn't matter."""
    return num_1 * num_2
    
def divide(num_1, num_2):
    """Wait, which number is the numerator and which is the denominator?"""
    result = num_1 / num_2
    
    return result

def exponent(num_1, num_2):
    """Take num_1 to the power of num_2"""
    result = num_1 ** num_2
    
    return result

Now that we have all our components we can build the calculator. As a reminder, here is our function design:

  • Input:
    • Mathematical operation
    • Number 1
    • Number 2
  • Output:
    • Resulting value from operation
  • Process (pseudocode):
    • Check for inputted operation
    • Calculate the operation using two inputted values
    • Return resulting value

Knowing that, we’ll follow this design pretty closely:

def calculator(operation, num_1, num_2):
    """
    Calculates a value given a mathematical operation and two values.

    Args:
        operation (str): Choice of ["add", "subtract", "multiply", "divide", "exponent"]
        num_1 (Union[int, float]): Value 1
        num_2 (Union[int, float]): Value 2

    Returns:
        Union[int, float]: Value returned from the desired math operation.
    """
    if operation == "add":
        res = add(num_1, num_2)
    elif operation == "subtract":
        res = subtract(num_1, num_2)
    elif operation == "multiply":
        res = multiply(num_1, num_2)
    elif operation == "divide":
        res = divide(num_1, num_2)
    elif operation == "exponent":
        res = exponent(num_1, num_2)
    else:
        print("Bad input")
        res = None

    return res

calculator("sum", 2, 4)
Bad input

As you can see, this function follows a pretty clear logic: depending on the desired operation, it runs the matching function defined above with the given values and then returns the result. This demonstrates how functions can be called inside other functions and their results used by the outer function. You can repeat this pattern to build a nested structure, or you can design a flat structure that nests function calls as little as possible. Which is better is hotly debated, but my rule of thumb is that as long as the code follows a consistent logic and you document your functions well, nesting can be a great way to chain operations together, as above.
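
For comparison, here is one possible flatter design, offered only as a sketch and not as a replacement for the calculator above. It swaps the if/elif chain for a dictionary lookup, and to stay self-contained it uses functions from Python's standard operator module instead of the add(), subtract(), etc. defined in this lesson. The names OPERATIONS and calculator_flat are made up for this example.

```python
import operator

# Map each operation name to a function from the standard operator module
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
    "exponent": operator.pow,
}


def calculator_flat(operation, num_1, num_2):
    # Look up the function; .get() returns None for unknown names
    func = OPERATIONS.get(operation)

    if func is None:
        print("Bad input")
        return None

    return func(num_1, num_2)


print(calculator_flat("add", 2, 4))
print(calculator_flat("sum", 2, 4))
```

The first call prints 6; the second prints "Bad input" followed by None, matching the behavior of the original calculator. Adding a new operation here means adding one dictionary entry rather than another elif branch.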

Testing

Now that we’ve completed our function it’s time to test it. Testing is often overlooked in the world of scientific computing, but it is arguably even more important there, where large amounts of data and long workflows make consistent results essential.

In Python, basic testing is straightforward with the assert statement. assert checks a condition, such as whether two values are equal: if the condition is true, nothing happens and the program continues; if it is false, Python raises an error. This allows us to test a given function by writing a “unit test” for it, where we input a known value with a known outcome and check whether the function responds appropriately.

Here is a basic example:

#Passes, nothing happens
assert calculator("add", 2, 2) == 4
#Fails, raises an error
assert calculator("add", 2, 2) == 5
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[12], line 2
      1 #Fails, raises an error
----> 2 assert calculator("add", 2, 2) == 5

AssertionError: 

As you can see, when an assertion fails it raises a specific type of error called an AssertionError. If you don’t want the program to proceed when the test fails, you can just let it error out. If you want to acknowledge that it happened but keep going, you can use a try/except block (called try/catch in some other languages). Here’s an example:

try:
    assert calculator("add", 2, 2) == 5
except AssertionError:
    print("Test failed!")
Test failed!
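
Two small refinements are worth knowing when writing assertions like these. The sketch below redefines divide() so that it runs on its own, and uses math.isclose from the standard library; the specific numbers are arbitrary choices for illustration.

```python
import math


def divide(num_1, num_2):
    return num_1 / num_2


# An optional message after the comma is shown if the assertion fails
assert divide(9, 3) == 3, "divide(9, 3) should equal 3"

# Floating-point arithmetic is not always exact: 0.1 + 0.2 is not exactly 0.3
result = divide(1, 10) + divide(2, 10)
print(result == 0.3)

# So compare floats with a tolerance instead of ==
assert math.isclose(result, 0.3), "result should be approximately 0.3"
```

The print shows False even though the math looks right on paper, which is why tests on division results usually compare with a tolerance rather than exact equality.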

While this is handy for small situations, it’s preferable to write unit tests for all functions and store them in separate modules for your code. Although automated unit testing is outside the scope of this lecture, I highly recommend you read this excellent tutorial on RealPython.

Consistent and robust unit testing will not only save your code from silently producing erroneous results, it will ensure your code is reproducible even if you make changes.
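
As a small taste of what organized tests can look like, the sketch below wraps assertions in functions whose names start with test_. This naming is a convention that tools such as pytest use to find tests automatically; here the functions are simply called by hand. The two functions under test are redefined so the example runs on its own, and the test names are made up for this illustration.

```python
def add(num_1, num_2):
    return num_1 + num_2


def subtract(num_1, num_2):
    return num_1 - num_2


def test_add():
    # Known inputs with known outcomes
    assert add(2, 2) == 4
    assert add(-1, 1) == 0


def test_subtract_order_matters():
    assert subtract(5, 3) == 2
    assert subtract(3, 5) == -2


# Without a test runner, the tests can be called one by one;
# a failing assertion would stop the program with an AssertionError
test_add()
test_subtract_order_matters()
print("All tests passed")
```

Keeping each test in its own small function makes it clear which behavior broke when something fails, and the functions can later be moved into a separate test module unchanged.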

Exercises

The following exercises will help you better understand functions, scope, and testing.

  1. Write a function that, given two positional args in first and second position, returns them in the opposite order, i.e. second then first.
# Question 1 code here
  2. Write a better docstring for the divide() function defined earlier.
# Question 2 code here
  3. Write a function that subtracts a global variable from a local variable and returns the result, then print said result.
# Question 3 code here
  4. Write a unit test for any of the calculator sub-functions above, and write a conditional print statement based on the outcome, as demonstrated above.
# Question 4 code here