Fixing bugs in a program can be challenging, especially as the code grows longer, which is often the case in Machine Learning and Data Science projects. Multiple teams work on the same project and develop different modules one after another, so clean and understandable code matters a lot.
To make debugging easier, developers often divide their code into smaller, manageable chunks called functions. This makes it easy to identify the function in which a bug is located. In Python, functions can be either user-defined or built-in, and we will discuss both in this blog.
After going through this blog, we will learn the following concepts:
Let’s understand the use of functions in our day-to-day life.
Let’s take an example of a washing machine. We input dirty clothes, and it produces cleaned clothes as output. The machine follows a set of instructions, or a program, to wash the clothes. One of the actions in the program is filling water for cleaning. This process includes:
This process is repeated multiple times during the cleaning and drying cycle. When programming the machine, creating a reusable module for the water-filling process is more efficient than writing the same instructions repeatedly. This is similar to how a function works in programming.
Let’s start by learning more about Python functions.
A function is a group of instructions organized together in a block. These instructions are executed only when the function is called within a program. For example, let’s say we have ten different lists and want to arrange the elements in all the lists in ascending order. There can be two ways to perform this:
Indeed, the second option is much more sensible and will help decrease the code length. We will shortly see one implementation of a function. Functions in Python offer several advantages:
Now that we know about the functions, let’s learn how to create them.
Like any programming language, Python has defined rules that code must follow, and we need to follow its syntax to write functions.
def function_name(inputs):
    # Statements
    return
The snippet above illustrates the structure of a Python function. The major components of the structure include:
Let’s create a function that calculates the square of a number passed to it as input.
def calc_square(num):
    square = num * num
    print("The square of %d is :" % num, square)
    return
There is another way of writing a function in which we explicitly annotate the data types of the input arguments and the return value. These annotations are type hints: Python does not enforce them at runtime.
def function_name(var1: int, var2: str, var3: float) -> int:
    # statements to be executed
    return variable
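As a minimal sketch of the annotated form, the example below uses a hypothetical function named repeat_word (it is not part of the examples in this blog). The annotations are stored on the function and are useful for readers and external type checkers, but nothing stops a caller from passing a different type.

```python
def repeat_word(word: str, times: int = 2) -> str:
    """Return `word` repeated `times` times, separated by spaces."""
    return " ".join([word] * times)


result = repeat_word("ML", 3)
print(result)  # ML ML ML

# The annotations are kept as metadata on the function object.
print(repeat_word.__annotations__)
```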
Now that we have defined a function, let’s learn how to use it, or call it, inside our programs.
Once a Python function is defined, we can call it anywhere in our programs after the definition. For example, earlier, we designed a calc_square function, and now we will call it for multiple input values.
calc_square(7)
# Output
The square of 7 is : 49
calc_square(10)
# Output
The square of 10 is : 100
If a Python function returns a value instead of using an empty return, as the first calc_square function did, we need to store the returned value in a variable to use it later.
def calc_square(num):
    square = num * num
    return square
num = 7
sqr = calc_square(num)
print("The square of %d is :" % num, sqr)
# Output
The square of 7 is : 49
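The difference between an empty return and a returned value can be seen by capturing both results. In this sketch, print_square is a hypothetical name for the printing variant, used only to keep the two versions apart.

```python
def print_square(num):
    print("The square of", num, "is :", num * num)
    return  # an empty return hands back None


def calc_square(num):
    return num * num  # the caller receives the computed value


printed = print_square(4)  # prints the message
returned = calc_square(4)

print(printed)   # None
print(returned)  # 16
```

A function that only prints cannot feed its result into further calculations, which is why returning the value is usually the more reusable design.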
Now we know how these blocks of statements can be reused inside our programs. Let’s learn more about the types of these arguments using which we can call a Python function.
There are mainly three types of arguments:
Default arguments are parameters that are already assigned a value in the function definition. For example:
def calc_multiply(var1, var2=7):
    multi = var1 * var2
    return multi
The above function multiplies two variables, var1 and var2, where var2 has a default value of 7. Let’s see how two types of calls affect the result.
# Type 1
multi = calc_multiply(11)
print(multi)
### Output
77
# Type 2
multi = calc_multiply(5, 9)
print(multi)
### Output
45
Please note that in the Type 1 call, we do not pass a value for the default parameter, so the function automatically uses the default value. We can override the default by passing an additional argument, as shown in the Type 2 call. But what happens if we do not pass a value for a parameter that has no default? Let’s see:
multi = calc_multiply()
TypeError: calc_multiply() missing 1 required positional argument: 'var1'
Yes, it throws a TypeError, which tells us that the positional argument 'var1' is missing a value. One interesting thing to note is that the order of parameters must follow some basic rules of Python function syntax. One of them is that non-default parameters must come before default parameters. So the example shown below will throw a SyntaxError:
def calc_multiply(var1=7, var):
SyntaxError: non-default argument follows default argument
If we pass two values to calc_multiply, it will, by default, treat the first value as var1 and the second as var2. Arguments can also be linked to parameters by name (keyword arguments), which lets us pass them out of order. For example, we can call the same calc_multiply function like this:
multi = calc_multiply(var2=9, var1=5)
print(multi)
# Output
45
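Positional and keyword arguments can also be mixed in one call, as long as the positional ones come first. The sketch below redefines calc_multiply so that it runs on its own, and also shows the error raised when the same parameter receives two values.

```python
def calc_multiply(var1, var2=7):
    return var1 * var2


# Positional first, keyword second: allowed.
mixed = calc_multiply(5, var2=9)
print(mixed)  # 45

# Only the required parameter is named; var2 falls back to its default.
default_used = calc_multiply(var1=3)
print(default_used)  # 21

# Giving var1 both positionally and by name raises a TypeError.
try:
    calc_multiply(5, var1=9)
except TypeError as err:
    print("TypeError:", err)
```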
In Python functions, we can also pass a variable number of arguments using two special symbols: * (as in *args) for positional arguments and ** (as in **kwargs) for keyword arguments.
Let’s see their examples for a better understanding:
# *args example
def cal_sum_of_arguments(*args):
    total = 0  # named total to avoid shadowing the built-in sum()
    for arg in args:
        total += arg
    return total
total_sum = cal_sum_of_arguments(1, 5, 7, 9, 11)
print("Sum of all arguments is : ",total_sum)
# Output
Sum of all arguments is : 33
# **kwargs example
def print_paired_arguments(**kwargs):
    for key, value in kwargs.items():
        print("Key = ", key, " value = ", value)
    return
print_paired_arguments(var1='Enjoy',var2='Algorithms', var3='ml')
# Output
Key = var1 value = Enjoy
Key = var2 value = Algorithms
Key = var3 value = ml
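Both symbols can appear in the same function, and the same symbols also unpack a list or dictionary at the call site. This is a small sketch with a hypothetical function named describe_call; inside the function, args arrives as a tuple and kwargs as a dictionary.

```python
def describe_call(*args, **kwargs):
    """Return the positional and keyword arguments that were received."""
    return args, kwargs


positional, named = describe_call(1, 2, model="svm", epochs=10)
print(positional)  # (1, 2)
print(named)       # {'model': 'svm', 'epochs': 10}

# * unpacks a list into positional arguments, ** unpacks a dict into keyword arguments.
numbers = [3, 4]
options = {"model": "tree"}
positional2, named2 = describe_call(*numbers, **options)
print(positional2)  # (3, 4)
print(named2)       # {'model': 'tree'}
```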
The return statement exits the function and brings execution back to the line where the function was called. Using the return statement, we can hand back values computed inside the function. In Python, we can return multiple values of varying data types from a function; they are packed into a single tuple, which the caller can unpack. For example:
def print_paired_arguments(**kwargs):
    keys = []
    values = []
    for key, value in kwargs.items():
        keys.append(key)
        values.append(value)
    return keys, values
keys, values = print_paired_arguments(var1='Enjoy',var2='Algorithms', var3=3)
# Output
keys = ['var1', 'var2', 'var3']
values = ['Enjoy', 'Algorithms', 3]
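To see what a multi-value return actually hands back, the sketch below uses a hypothetical helper named min_and_max. The comma-separated values come back as one tuple, which can be kept whole or unpacked into separate names.

```python
def min_and_max(values):
    """Return the smallest and largest item of a non-empty list."""
    return min(values), max(values)


result = min_and_max([9, 7, 5, 10])
print(type(result))  # <class 'tuple'>
print(result)        # (5, 10)

# Unpacking assigns each element of the tuple to its own variable.
smallest, largest = min_and_max([9, 7, 5, 10])
print(smallest, largest)  # 5 10
```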
Everything is an object in Python, and variables reference these objects. But when we pass these variables as input arguments, a new reference to that object is created. This can be tricky to understand, but let’s see this via an example.
def referencing(var1):
    var1[0] = 7
    return
var = [1, 0, 3, 4, 7, 9, 11]
referencing(var1 = var) # here var1 is a new reference to the same var
print(var)
# Output
[7, 0, 3, 4, 7, 9, 11]
## Please note that the var got changed as we are modifying the object present at the same reference.
When we reassign var1 to something else altogether inside the function, it starts referencing a different object, and the link to the original object breaks. Let’s see that with an example.
def referencing(var1):
    var1 = [7, 9, 11, 12, 3, 4, 5]
    return
var = [1, 0, 3, 4, 7, 9, 11]
referencing(var1=var)
print(var)
# Output
[1, 0, 3, 4, 7, 9, 11]
## Please note that the var did not change as we are modifying the reference itself.
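The two cases above can be checked directly with the is operator, which tells us whether two names refer to the same object. The function names mutate and rebind are hypothetical, chosen only for this sketch.

```python
def mutate(items):
    items.append(99)      # changes the object the caller also sees
    return items


def rebind(items):
    items = items + [99]  # builds a new list; the caller's list is untouched
    return items


original = [1, 2, 3]

same = mutate(original)
print(same is original)   # True
print(original)           # [1, 2, 3, 99]

other = rebind(original)
print(other is original)  # False
print(original)           # [1, 2, 3, 99]
```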
In Python, we also have functions that exist without a name. We use the def keyword for named functions and the lambda keyword for anonymous functions. A lambda function can take any number of input arguments, but its body must be a single expression, and the value of that expression is what the lambda returns. Calling print inside a lambda is therefore rarely useful: it prints the text, but the lambda itself returns None.
lambda_ = lambda var1, var2, var3: var1 + var2*var3
lambda_(1,2,3)
# Output
7
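Lambdas are most often passed as short, throwaway arguments to other functions. The sketch below uses made-up model names and scores purely for illustration, and sorts them with a lambda as the key.

```python
# Hypothetical (model name, score) pairs, used only as sample data.
pairs = [("svm", 0.91), ("tree", 0.87), ("knn", 0.93)]

# The lambda picks the second item of each tuple as the sort key.
by_score = sorted(pairs, key=lambda pair: pair[1])
print(by_score)  # [('tree', 0.87), ('svm', 0.91), ('knn', 0.93)]

# The same logic written with def, for comparison.
def second_item(pair):
    return pair[1]

print(sorted(pairs, key=second_item) == by_score)  # True
```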
As we said, there are mainly two types of Python functions: built-in and user-defined.
The len function gives us the length of the object passed as its argument. For example:
var1 = "This is a pen"
print(len(var1))
# Output
13
## Because there are 13 characters in the variable var1:
## ['T', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', ' ', 'p', 'e', 'n']
var2 = [1, 3, 5]
print(len(var2))
# Output
3
# Type function shows the data type of the var
type([1, 2, 3])
# Output: <class 'list'>
# Converts valid input into a tuple
tuple([1, 2, 3])
# Output:(1, 2, 3)
# Converts valid input into a string
str(7)
# Output:'7'
# Round off the valid input variable
round(6.6)
# Output: 7
# reverse the list
list(reversed([1, 2, 3]))
# Output: [3, 2, 1]
print() # prints the input passed
int() # converts the input into int
sorted() # sorts the list in ascending order
list() # converts the input into a list
If we go through some GitHub repositories for Machine Learning and Data Science projects, we notice a common practice: developers use many functions that are not defined inside the Python file itself. Still, those functions work. How?
These functions are not defined in that Python script; they are defined inside a library or framework, and we call them directly from our program. To do so, we first need to reference the function we want to call. For that, we use the import statement, and one example is shown below.
from numpy import reshape
Here we import the reshape function, which is defined in the NumPy library. Once done, we can use the reshape function anywhere inside our script. We will learn about the NumPy library in a separate blog.
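The import statement comes in a few common forms. The sketch below uses the standard-library math module instead of NumPy, an assumption made only so that the snippet runs without installing anything; the same forms work for third-party libraries.

```python
import math                   # import the whole module
from math import sqrt         # import one function by name
from math import floor as fl  # import a function under a shorter alias

print(math.sqrt(16))  # 4.0, called through the module name
print(sqrt(16))       # 4.0, called directly
print(fl(6.6))        # 6
```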
One of the best use cases of this functionality is smoother maintenance. We can group the functions used for one task into a separate Python script and call them from our main file. For example, data processing involves a broad range of in-built and custom functions, while training only needs the processed data. If we place all pre-processing functions in a pre-processing script and call them from our training script, the code will look much cleaner and more self-explanatory.
While building ML projects, we define custom functions per our requirements and use the numerous in-built functions in Python. Some of the in-built functions we use most frequently in ML or data science projects are:
The print function prints a predefined statement or any string. It is very helpful in debugging, as it tells us which line is creating the problem. Large codebases use print statements to provide the right message for debugging an error.
a = "You are reading at EnjoyAlgorithms' blog."
print(a)
# You are reading at EnjoyAlgorithms' blog.
print('This is the message :', a)
# This is the message : You are reading at EnjoyAlgorithms' blog.
Here is an example of how print helps in debugging. Suppose the code base is huge, and we must figure out which line produces an error. We write multiple print statements inside our code. If a print is not executed, the problem lies before that print line, and we can focus on that area.
a = [1, 2, 3, 4, 5]
b = [0, 1, 3, 5]
c = a + b
print('This is a valid operation and output is :', c)
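# This is a valid operation and output is : [1, 2, 3, 4, 5, 0, 1, 3, 5]
d = a - b
# TypeError: unsupported operand type(s) for -: 'list' and 'list'
print('This is an invalid operation for lists')
# Not printed: the line above stops the program, so the bug lies before this print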
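When we want the script to keep running while we investigate, the same checkpoint idea can be combined with try/except. This is one possible sketch, not the only way to debug; the checkpoint labels are arbitrary.

```python
a = [1, 2, 3, 4, 5]
b = [0, 1, 3, 5]

print("Checkpoint 1: before concatenation")
c = a + b
print("Checkpoint 2: concatenation worked:", c)

try:
    d = a - b  # lists do not support subtraction
    print("Checkpoint 3: subtraction worked")
except TypeError as err:
    d = None
    print("Checkpoint 3 was never reached:", err)
```

The error message names the failing operation, so we know the problem lies between checkpoints 2 and 3.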
The abs function returns the absolute value of the number passed to it. For example:
a = -7
print('Absolute value of a is: ',abs(a))
# Absolute value of a is: 7
The type function is used to find the data type of any variable. We discussed this function in our introduction to data types blog. It returns the class to which the input object belongs.
a = 7
print('Data type of a is :', type(a))
# Data type of a is : <class 'int'>
The sorted function returns a new list containing the items of the given list or tuple in ascending order, or in descending order when reverse=True is passed. While pre-processing the data samples, we sometimes need to sort the samples to extract meaning from the data.
a = [9, 7, 5, 10, 11, 99]
print(sorted(a))
# [5, 7, 9, 10, 11, 99]
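Two optional parameters of sorted are worth knowing: reverse for descending order and key for a custom sort criterion. A short sketch with sample values chosen for illustration:

```python
a = [9, 7, 5, 10, 11, 99]

descending = sorted(a, reverse=True)
print(descending)  # [99, 11, 10, 9, 7, 5]
print(a)           # [9, 7, 5, 10, 11, 99], the original list is unchanged

# key=len sorts the words by their length instead of alphabetically.
words = ["data", "ml", "science"]
by_length = sorted(words, key=len)
print(by_length)   # ['ml', 'data', 'science']
```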
The enumerate function is mainly used when writing loops in Python: it yields each item together with its index. We discussed the details of this use case in our introduction to loops in Python blog.
a = [9, 7, 5, 10, 11, 99]
for i, num in enumerate(a):
    print('This is index number: ',i,' and this is value: ', num)
#This is index number: 0 and this is value: 9
#This is index number: 1 and this is value: 7
#This is index number: 2 and this is value: 5
#This is index number: 3 and this is value: 10
#This is index number: 4 and this is value: 11
#This is index number: 5 and this is value: 99
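By default the index starts at 0, but enumerate accepts a start argument when a different first number is more natural, for example when numbering items from 1. A minimal sketch:

```python
a = [9, 7, 5]

# Each element is an (index, value) pair; here counting begins at 1.
numbered = list(enumerate(a, start=1))
print(numbered)  # [(1, 9), (2, 7), (3, 5)]
```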
The split method splits a string based on a defined separator. While building machine learning models on text data or analyzing text data, split is used in almost every code. It returns a list after splitting the string.
a = "This is EnjoyAlgorithms, and you are reading an introduction to function blog"
print(a.split(',')) # here we split the string a based on ,
# ['This is EnjoyAlgorithms', ' and you are reading an introduction to function blog']
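Two further behaviours of split come up often with text data: calling it without a separator splits on any run of whitespace, and maxsplit limits the number of splits. The strings below are made-up sample data.

```python
# No separator: split on whitespace and drop leading/trailing blanks.
line = "  sepal_length  sepal_width   species "
columns = line.split()
print(columns)  # ['sepal_length', 'sepal_width', 'species']

# maxsplit=2: split at the first two commas only, keep the rest intact.
record = "5.1,3.5,Iris-setosa,extra,notes"
fields = record.split(",", maxsplit=2)
print(fields)   # ['5.1', '3.5', 'Iris-setosa,extra,notes']
```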
The range function is mostly used while writing for loops in Python, as we saw in the introduction to loops blog. It produces the sequence of integers from the start value up to, but not including, the stop value passed as arguments.
for i in range(5, 10):
    print(i)
# 5
# 6
# 7
# 8
# 9
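range also accepts a single argument (counting from 0) and an optional third argument for the step, which may be negative. Wrapping the result in list makes the generated sequence visible:

```python
first_five = list(range(5))        # start defaults to 0
evens = list(range(0, 10, 2))      # step of 2
countdown = list(range(5, 0, -1))  # negative step counts down

print(first_five)  # [0, 1, 2, 3, 4]
print(evens)       # [0, 2, 4, 6, 8]
print(countdown)   # [5, 4, 3, 2, 1]
```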
The append method adds a single element to the end of a list in Python, and we can easily find it in any ML project code.
a = [9, 7, 5, 10, 11, 99]
a.append(77)
print(a)
# [9, 7, 5, 10, 11, 99, 77]
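Because append always adds exactly one element, passing it a list nests that list inside the original. The related list method extend adds the items one by one instead. A short comparison:

```python
a = [9, 7, 5]
a.append([1, 2])  # the whole list becomes a single new element
print(a)          # [9, 7, 5, [1, 2]]

b = [9, 7, 5]
b.extend([1, 2])  # each item is added separately
print(b)          # [9, 7, 5, 1, 2]
```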
The min and max functions find the minimum and maximum values in a list or array. They are useful during data analysis when we want to see the smallest and largest values in our data.
a = [9, 7, 5, 10, 11, 99]
print(max(a))
# 99
print(min(a))
# 5
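Like sorted, both functions accept a key argument, and they also accept a default value that is returned when the input is empty (without it, an empty input raises a ValueError). The words below are sample data for illustration.

```python
words = ["data", "ml", "science"]

longest = max(words, key=len)   # compare by length, not alphabetically
shortest = min(words, key=len)
print(longest)   # science
print(shortest)  # ml

# default avoids an error when there is nothing to compare.
safe = max([], default=0)
print(safe)      # 0
```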
In this article, we discussed one of the basic concepts in Python programming: functions. We learned how to create a function and call it in our program. We also discussed input arguments in greater detail with examples. Lastly, we saw some in-built Python functions frequently used in Python programming. We hope you found the article enjoyable.
Enjoy Learning!