First steps

This chapter provides an introduction to basic commands and useful built-in libraries.

As mentioned, this course will rely on jupyter notebooks. Code is written in cells, which can be run independently. To run a cell, you can click the run symbol above the cells. A more convenient way is the corresponding keyboard shortcut: holding shift and pressing enter.
This will run the current cell and select the next cell below.

import sys
sys.version
'3.8.8 (v3.8.8:024d8058b0, Feb 19 2021, 08:48:17) \n[Clang 6.0 (clang-600.0.57)]'

Basic Python

Just starting a jupyter notebook, and thus a python kernel, already allows several basic operations. Perhaps the most obvious functionality is to use a cell to calculate a mathematical expression. To do so, just enter the expression and run the cell:

999+1
1000
# comments inside the cell, starting with '#' help readers to understand your code. 
# These lines are not run! 
2 * (2 + 2) - 4
4

While jupyter will print the result of such a one-line expression immediately, writing both expressions in one cell will print only the last computed result:

999+1
2 * (2 + 2) - 4
4

In the common case where the code inside one cell exceeds one line or produces more than just one result, the print() function can be used. Functions in python receive their parameters in parentheses (more on functions later):

print(999+1)
print(2 * (2 + 2) - 4)
1000
4

Some basic operations and functions are

print('Addition:',       1 + 1 )      
print('Subtraction:',    2 - 1 )      
print('Division:',       2 / 1 )      
print('Multiplication:', 1 * 2 )      
print('Modulus:',        3 % 2 )      
print('Floor division:', 5 // 2 )     
print('Exponent:',       2 ** 3 )     
print('Minimum:',       min(2,5))
print('Maximum:',       max(2,5))
Addition: 2
Subtraction: 1
Division: 2.0
Multiplication: 2
Modulus: 1
Floor division: 2
Exponent: 8
Minimum: 2
Maximum: 5

Autocompletion is a very useful tool when writing code. It not only suggests already defined variables, but also shows the methods available on different objects (more on objects later).

Autocompletion or suggestions for completion will appear when pressing tab in code, as long as it is not the beginning of a line (where it will indent the line, see loops, functions).
For instance, when typing pri and hitting tab, jupyter will complete it to print.

Help

In order to get information, for example on the print() function, help can be summoned by using yet another function:

help(print)
Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.

Data Types

The most basic data types in python are numeric, string and boolean; however, there are many, many more. For many operations to work properly, the data types must be compatible with one another. With the type() function, the data type of an object can be printed.

a = 1
b = 1.0
c = 'two'
d = True
print('a:', type(a))
print('b:', type(b))
print('c:', type(c))
print('d:', type(d))
a: <class 'int'>
b: <class 'float'>
c: <class 'str'>
d: <class 'bool'>

Changing the type of a variable can be achieved by the respective functions:

print(int(3.14))

f = "2.5"
print(10 + float(f))
3
12.5
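
Conversely, a number can be turned into a string with str(), and a conversion that makes no sense raises an error. A small illustrative sketch (the variable g is only used here):

g = 42
print('the answer is ' + str(g))   # an int must be converted to str before concatenation
# int('two')                       # would raise ValueError: invalid literal for int()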

Some mathematical operations can also be applied to strings/lists:

a = 'Who would '
b = 'have thought'
c = '?'
print(a+b+(c*5))
Who would have thought?????

Packages and Modules

Python’s functionality can be expanded by loading modules into the namespace. A module may be seen as a chunk of python code for a specific task that someone has already written for you to use, while a package bundles one or more modules together with the file structure needed to load them.

In order to make a module’s complete functionality available, it has to be loaded into the namespace with the import statement. Calling a module function in this case requires the module name, followed by a dot, before the function: module.function().

A module can be imported under an alias, using the as statement. This can help someone reading your code to know from which module a function is imported while keeping the code shorter and thus better readable.

Tip

By convention all imports are done at the beginning of the code.

import math
print(math.sqrt(4))

import numpy as np
print(np.sqrt(4.0))
2.0
2.0

To import a single function (or class) from a module, one calls from <module> import <function>. Here, the module’s name does not need to be added when calling the function. Similarly, an asterisk can be used as a wild card, as in from <module> import *, to import all functions (and classes) from a module, so that the module prefix does not have to be added every time in the code.
The following example uses the math module, which contains functions such as the sine, cosine or square root.

from math import sqrt
print(sqrt(4))
2.0

Note

Calling a function which is not imported results in an error.

Usually, python is rather explicit in where to look for your mistake.

#print(cos(0))    # would raise error as cos() function is not imported

With the wild card, all functions from the math module are now loaded to the namespace.

from math import *
cos(0)
1.0

Variables

Objects can be stored in variables by assigning them with =. Since everything in python is an object, anything can be saved in a variable: numbers, strings, lists, etc. Python automatically infers the type of a variable and will let the user know about incompatible operations. Calling a variable before it is assigned will also throw an error.
Note that python is case sensitive, meaning that when assigning variables a ≠ A.

var_1 = 5 ** 2
print(var_1)
var_2 = 'twenty five'
print(var_2)
25
twenty five

Caution

Do not name your variables like built-in python objects!

In jupyter notebooks, built-in objects are automatically coloured green. For example:

list
int
dict
dict
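
A minimal sketch of what can go wrong when a built-in name is shadowed (assuming a fresh session; the variable name list here is deliberately a bad choice):

list = [1, 2, 3]      # 'list' now refers to this list, not to the built-in type
# list('abc')         # would raise TypeError: 'list' object is not callable
del list              # remove the shadowing variable; the built-in is usable again
print(list('abc'))    # ['a', 'b', 'c']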

Logical Conditions

In order to compare values, logical operators can be used. The most common are

Operator    Meaning
==          equal
!=          not equal
>           greater than
<           less than
>=          greater than or equal to
<=          less than or equal to

The result will be a boolean: either True or False.
Comparisons may also be combined with and or or, and can be negated with not.

a = (1<2)
b = (0 == 1)
print(a, b)
print(a and b)
print(a or b)
True False
False
True
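
Negation with not, continuing the variables from above:

print(not b)            # True
print(not (a and b))    # True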

Branching

In a program, code must often be run conditionally on some input or previous result. The if, elif and else statements can be used to select which branch of code shall be run. The condition is followed by a colon, and the respective code for that condition must be indented (automatically after a line break, or by pressing tab).

Caution

Indentation is not mere formatting for better readability.

It is part of the syntax and must not be used where it is not required!

a = 4

if a < 0:
    print('a is negative')
elif a > 0:
    print('a is positive')
else:
    print('a equals zero')
a is positive

Lists and Tuples

Lists and tuples are both an ordered collection of objects. While tuples cannot be modified, lists are rather flexible. Objects can be added to a list, removed or replaced. Lists appear in square brackets [], tuples in parentheses ().
Lists and tuples can be nested, meaning elements of a list or tuple can again be a list or tuple. Duplicates are also allowed. (A set() would not allow duplicates and is not ordered).
Elements can be accessed by indexing with square brackets behind the list name.

Note

Indexing in python starts with zero, so the first element of a list will have index 0.

my_list = [1,2,0,4,5]
my_tuple = ('a', 'b', 'c')

# length of object:
print('0. the length of my list is', len(my_list),', the length of the word "list"', len('list'))

# access first element
print('1. first element from my_list:', my_list[0])

# print variables more conveniently
print(f'2. second element of my_tuple: {my_tuple[1]}')

# to change an element
my_list[2] = 3
print(f'3. changing the third element in my_list: {my_list}')

# note that strings are sequences of characters and can be indexed like lists:
my_str = 'expression'
print(f'4. first letter in my_str: {my_str[0]}')

# last element 
print(f'5. last element of my_tuple: {my_tuple[-1]}')

# slicing
print(f'6. from element 3 to end of my_list: {my_list[2:]}')

# reverse order
print(f'7. reverse order of my_list and slice: {my_list[3::-1]}')

# delete element
del my_list[4]
print(f'8. after deleting the last element: {my_list}')
0. the length of my list is 5 , the length of the word "list" 4
1. first element from my_list: 1
2. second element of my_tuple: b
3. changing the third element in my_list: [1, 2, 3, 4, 5]
4. first letter in my_str: e
5. last element of my_tuple: c
6. from element 3 to end of my_list: [3, 4, 5]
7. reverse order of my_list and slice: [4, 3, 2, 1]
8. after deleting the last element: [1, 2, 3, 4]

Lists also allow for element checks using in (negated by not):

my_list = [1,2,3,4]

if 2 in my_list:
    print('2 is an element of my_list')
    
if 10 in my_list:
    print('10 is an element of my_list')
2 is an element of my_list

Besides definition by hand, the functions list() and tuple() can be used to transform a suitable object into a list or tuple.

print(tuple(my_list))
print(list('python'))
(1, 2, 3, 4)
['p', 'y', 't', 'h', 'o', 'n']
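
Lists also come with methods to add and remove elements, while tuples, being immutable, do not support such operations. A small sketch:

my_list = [1, 2, 3, 4]
my_list.append(5)        # add an element at the end
my_list.insert(0, 0)     # insert an element at position 0
my_list.remove(3)        # remove the first occurrence of the value 3
print(my_list)           # [0, 1, 2, 4, 5]
# my_tuple[0] = 'z'      # would raise TypeError: tuples do not support item assignment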

Dictionaries

Dictionaries allow storing key-value pairs in a kind of named-list fashion. By convention the key is of type string, while the value can be any object (including dictionaries). To define a dictionary, inside curly braces {}, each key is followed by a colon and its value. The keys must be unique in a dictionary, since the values are accessed via the respective key: as in lists, one uses square brackets, yet not with an index number but with the desired key.
New key-value pairs can be added to a dict in a similar way as they are accessed. The new key in square brackets follows the dict name and a value is assigned with an equal sign.
Dictionaries are accessed by key rather than by position; note that since Python 3.7 they preserve insertion order, but positional indexing is not supported.

my_dict = {'start': 1, 'end': 20}
print(my_dict['start'])

# new k-v pair
my_dict['mid'] = 10
print(my_dict)
1
{'start': 1, 'end': 20, 'mid': 10}

All keys and values can be accessed with the .keys() and .values() methods (more on methods later).

print(my_dict.keys(), my_dict.values())
dict_keys(['start', 'end', 'mid']) dict_values([1, 20, 10])
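
Two further useful methods are .items(), which returns the key-value pairs (used for looping below), and .get(), which returns a default value instead of raising an error for a missing key. A small sketch:

print(my_dict.items())             # dict_items([('start', 1), ('end', 20), ('mid', 10)])
print(my_dict.get('start'))        # 1
print(my_dict.get('stop', 'n/a'))  # n/a -- the key does not exist, so the default is returned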

(Un)packing

Python allows assigning multiple variables at once, which is called (un)packing. This is most common with tuples but can be extended to other iterables. The variables to be assigned are separated by commas.

my_tuple = ('one', 2, 'three')
a, b, c = my_tuple
print(a, b, c)

# the asterisk assigns all surplus values on the right hand side of the equal sign to a
*a, b = 1, 2, 3, 4
print(a ,b)
one 2 three
[1, 2, 3] 4
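
A classic application of unpacking is swapping two variables without a temporary helper:

a, b = 1, 2
a, b = b, a     # the right-hand side is packed into a tuple first, then unpacked
print(a, b)     # 2 1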

Loops

for-loop

A for-loop is used for iteration if the number of iterations is known prior to execution. A for-loop iterates over any sequence, like lists, tuples, strings etc. A common way is to loop over a range(n) object. Caution: indexing starts from 0, so a range(n) object runs from 0 to n-1!

Note that when iterating over a list or tuple, it is worth choosing an informative name for the loop variable (especially when nesting loops).

To write a loop, for is followed by the iterating variable, in and the sequence to iterate over, before a colon. The code to execute in every step begins on the next line and is indented.

# range object for iterating numbers
for i in range(5):
    print(f'{i} squared is {i**2}')  
print('')

# iterate over a tuple/list
for tuple_element in my_tuple:
    print(tuple_element)
print('')
# iterate over tuple/list element and index
for i, tup_el in enumerate(my_tuple):
    print(f'{tup_el} at position {i}')
print('')  
# iterate over keys and values in dict
for k,v in my_dict.items():
    print(f'key: {k}, value: {v} ')
0 squared is 0
1 squared is 1
2 squared is 4
3 squared is 9
4 squared is 16

one
2
three

one at position 0
2 at position 1
three at position 2

key: start, value: 1 
key: end, value: 20 
key: mid, value: 10 

while-loop

When the number of iterations for a loop is not known beforehand, a while-loop can be used. It will run until a terminal state is reached or some criterion is satisfied. Usually, an initial state is given, which is altered by some operation and thus leads to termination.
A while-loop starts with while, followed by the condition and a colon. The condition may be negated with not.

Warning

Infinite loops may occur when the terminal criterion is not properly defined or the code is otherwise defective.

a = 0
while a < 4:
    print(a)
    # use combined operator a += 1, equal to a = a + 1
    a += 1      # when not including this line, a will forever stay a = 0 and the loop will not terminate by itself
                # What will be printed in this case?
    
var = 5
check = True
while check:
    print(f'{var} is greater zero')
    var -= 1 
    check = var > 0
0
1
2
3
5 is greater zero
4 is greater zero
3 is greater zero
2 is greater zero
1 is greater zero

List comprehension

Python offers a handy way to create lists. It looks like a for-loop inside a list and is called list comprehension. It is written in one line instead of using indentation as in ordinary for-loops. These expressions can also be nested.

list_1 = [i for i in range(5)]
print('list_1:', list_1)

# a nested expression
list_2 = [[i*j for i in list_1] for j in [0,1]]
print('list_2:', list_2)
list_1: [0, 1, 2, 3, 4]
list_2: [[0, 0, 0, 0, 0], [0, 1, 2, 3, 4]]
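
A condition can be appended to a list comprehension to filter elements, a small sketch:

even_squares = [i**2 for i in range(10) if i % 2 == 0]
print(even_squares)    # [0, 4, 16, 36, 64]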

break and continue

For more control over a loop, the break and continue statements can be used. Combined with a condition, break will terminate the loop when the condition is satisfied, while continue will skip the rest of the current iteration and jump to the next.

for i in range(100):
    if i % 2 != 0:    # % is the modulo operator
        continue      # continue in the if-statement skips printing for odd i
    print(f'{i} is even')
    if i == 8:        # when i equals 8, the loop terminates (the print statement for i == 8 is executed before)
        break
0 is even
2 is even
4 is even
6 is even
8 is even

Functions

Python comes with many built-in functions, some of which have been shown or used before, as well as the option to define new functions.

To define a function, use def function_name(args): before the indented body of the function begins on a new line. args here means arguments, which are passed to the function.

Note that a function does not necessarily need to be defined with arguments.

def my_print(word):
    print(word)
    
my_print('Greetings')

def print_hi():
    print('hi')
    
print_hi()
Greetings
hi

Return

To assign the result of a function to a variable for further use, the return keyword is used. If more than one object is to be returned, use commas to separate them.

The return statement is indented at least once from def, even more than once when using branching, for example. With branching, several return statements may appear in one function.

The return command must not be confused with a print() statement!

import numpy as np

data = [1,2,3,4,5,6,7]
print(np.mean(data))

def my_mean(arg_list):
    sum_ = 0
    length_count = 0
    for el in arg_list:
        sum_ += el
        length_count += 1
    return sum_/length_count

print(my_mean(data))
4.0
4.0
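
As mentioned above, several objects can be returned at once by separating them with commas; the result is a tuple that can be unpacked directly. A small sketch extending my_mean (the function name mean_and_length is made up for this example):

def mean_and_length(arg_list):
    sum_ = 0
    length_count = 0
    for el in arg_list:
        sum_ += el
        length_count += 1
    return sum_/length_count, length_count   # returns a tuple of two values

mean, n = mean_and_length(data)   # unpack the returned tuple
print(mean, n)                    # 4.0 7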

Yield

Besides return, another option is yield. The main difference is that return will do a calculation and send the result back at once, while when using yield a generator object is created and results can be returned sequentially.

Comparing the following two examples, the return_list function stores the complete list in the memory, while the yield_list function does not. Instead, it returns the values one after another (when using the next() function or a loop) remembering the current state of the function.

def return_list(n):
    return [i for i in range(n)]

def yield_list(n):    # function mimics range()
    i = 0
    while i < n:
        yield i
        i += 1
return_list(5)
[0, 1, 2, 3, 4]
yield_list(5)   # creates a generator object
<generator object yield_list at 0x7ff6c0319510>
gen = yield_list(5)    # assign generator object to variable
print(next(gen))    # call next() to jump to next 'yield'
print(next(gen))    # only one value in memory
print(next(gen)) # after 3 and 4 have also been yielded, a further next() raises StopIteration because the generator is exhausted
0
1
2

Since the state of the generator is remembered and we have already moved beyond 2 using next(), the following for loop ‘finishes’ the generator:

for i in gen:
    print(i)
3
4

Generator objects can be useful when working with huge files which do not fit into memory.

For functions with several arguments, the order of inputs matters: they will be assigned according to the function definition. When the arguments are specified by name in the function call, the input order does not matter.

def divide(numer, denom):
    return numer / denom

print(divide(10,2))
print(divide(2,10))

print(divide(denom=2, numer=10))
5.0
0.2
5.0

Default values for a function can be set. If an argument is not specified when calling the function, the default value will be used.

def divide(numer, denom = 1):
    return numer / denom
print(divide(numer=10))
10.0

More advanced function writing involves recursion, meaning python allows a function to call itself.

import math

def my_factorial(n):
    if n == 0:
        return 1
    else:
        return n * my_factorial(n-1)
    
print(my_factorial(5))
print(math.factorial(5))
120
120

Global and Local Variables

Variables can be defined globally, i.e. outside of functions. A variable defined inside a function will only exist inside the scope of the function. Should a local variable be given the same name as a global variable, the function will use the value locally defined!
Global variables inside functions can be defined using the global keyword before the respective variables.

x = 7
def f_1():
    print('calling f_1, x =',x)

f_1()
    
def f_2():
    x = 10

f_2()
print('after calling f_2: x =', x)
    
def f_3():
    x = 10
    print('calling f_3: x =', x)
    
f_3()
print('after calling f_3: x =', x)
    
def f_4():
    global x
    x = 10

f_4()
print('after calling f_4: x =', x)
calling f_1, x = 7
after calling f_2: x = 7
calling f_3: x = 10
after calling f_3: x = 7
after calling f_4: x = 10

Lambda functions

It may occur that a function is needed only once and has only limited functionality. To spare you and the reader of the code from jumping to a separate block defining such a function, you can use a lambda function. Lambda functions are written in-line: the keyword lambda is followed by the parameters and a colon before the body.
Arguments are passed in parentheses as usual.

Note

The lambda syntax should only be used for simple functions!

from math import exp

# regular way
def reg_func(x):
    return 1/exp(x)
print(reg_func(.5))

#lambda function with identifier 
l_func = lambda x: 1/exp(x)        # this way, it is hard to find the origin of a function if it is called elsewhere
print(l_func(.5))                  # in your code. It is not recommended, yet still possible.

# for single use, no identifier
print((lambda x: 1/exp(x))(.5) )
0.6065306597126334
0.6065306597126334
0.6065306597126334

Lambda functions can easily be applied in combination with list comprehensions, where they can be defined in place.

# with list comprehension
l_list = [(lambda x: 1/exp(x))(i) for i in list_1]
print(l_list)

# always consider doing it without a lambda function
print([(1/exp(i)) for i in list_1])
[1.0, 0.36787944117144233, 0.1353352832366127, 0.049787068367863944, 0.018315638888734182]
[1.0, 0.36787944117144233, 0.1353352832366127, 0.049787068367863944, 0.018315638888734182]
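
Another common use of lambda functions is as the key argument of functions like sorted(), where a small throwaway function determines the sorting criterion. A minimal sketch:

words = ['kiwi', 'fig', 'banana']
print(sorted(words, key=lambda w: len(w)))    # ['fig', 'kiwi', 'banana'] -- sorted by word length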

Callbacks

A callback is a function which is run as soon as some criterion is met, usually when some other task is finished, so that its result can be used in further computations. One example might be a file which needs to be imported and subsequently transformed in some way, but only once the import is complete. When defining callbacks, we make use of the fact that everything in python is an object: here, we pass a function as an argument to another function. The basic syntax when defining functions is the same as before.

The following is a basic example for a callback. The first function enter_string will print the string provided as argument. The string will then be handled depending on the callback: the length of the string will be printed or the string will be printed in reverse. The if clause in enter_string together with a default value of None allows us to omit a callback.

def enter_string(string, callback=None):
    print(string)
    if callback:
        callback(string)
    
def print_len(string):
    print(len(string))
    
def reverse_string(string):
    print(string[::-1])
    
enter_string('onomatopoeia', callback=print_len)

enter_string('stressed', callback=reverse_string)

enter_string('no callback here')
onomatopoeia
12
stressed
desserts
no callback here

Decorators

Functions which expand the functionality of another function are called decorators (or wrappers). However, these decorators do not change the underlying function, i.e. the one that gets decorated.

This is possible since functions, just like anything in python, are objects and can be passed as arguments to other functions (we have seen this in the context of callbacks). Furthermore, functions may be defined inside other functions, using the same syntax as usual.

Let’s look at a simple example of a function and a wrapper/decorator. The initial function text_to_wrap simply prints a string. The decorator returns the wrapper() function, which adds a line above and below the text, when printing.

def text_to_wrap():
    print('my text')
    
text_to_wrap()
my text
def emphasize_decorator(func):
    def wrapper():
        print('##################')
        func()
        print('!!!!!!!!!!!!!!!!!!')
    return wrapper

To decorate text_to_wrap, we can assign the decorator, called with text_to_wrap as its argument, to the original function name. Note that we pass the function name without parentheses.

text_to_wrap = emphasize_decorator(text_to_wrap)
text_to_wrap()
##################
my text
!!!!!!!!!!!!!!!!!!

To shorten this procedure, python includes a special syntax for decorators. With @emphasize_decorator (no parentheses!) placed before the definition of the function to be decorated, we can achieve the same behaviour.

# emphasize_decorator is being treated as already defined

@emphasize_decorator
def print_greeting():
    print('Hello')
print_greeting()
##################
Hello
!!!!!!!!!!!!!!!!!!

We see that print_greeting() has initially been defined to print ‘Hello’. With the decorator call using @, however, we have decorated it on the fly to add the emphasis lines around the text from emphasize_decorator().

Furthermore, decorators may be chained by writing several decorator calls with the same @-syntax before the definition of the function to be decorated. This example uses the same decorator twice, which is just a special case; any decorators can be chained.

@emphasize_decorator
@emphasize_decorator
def print_greeting():
    print('Hello')
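
Calling the doubly decorated function now adds the emphasis lines twice, once per decorator:

print_greeting()
# ##################
# ##################
# Hello
# !!!!!!!!!!!!!!!!!!
# !!!!!!!!!!!!!!!!!!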

To pass arguments through the decorator, we can use *args and **kwargs as placeholders for an arbitrary number of positional arguments and keyword arguments. Note that this is not unique to decorators, but can be used in the definition of any function! We will now update the first two examples from above:

  • text_to_wrap will get two arguments to print

  • wrapper inside emphasize_decorator will get the placeholders in its definition

def emphasize_decorator(func):
    def wrapper(*args, **kwargs):
        print('##################')
        func(*args, **kwargs)
        print('!!!!!!!!!!!!!!!!!!')
    return wrapper
@emphasize_decorator
def text_to_wrap(w1, w2):
    print(f"{w1}\n{w2}")
text_to_wrap('line1', 'line2')
##################
line1
line2
!!!!!!!!!!!!!!!!!!

Classes

Since almost everything in python is an object, classes are a very important element of its functionality. A class can be seen as a constructor for certain objects.

To define a class, the class keyword is followed by the class name. By convention, for class names the CamelCase style is used.

# define class Student

class Student:
    uni = 'Passau'
    subject = 'math'
    grades = [1.3, 1.7, 3.0]
        
# bind additional names to the class (note: this does not create separate instances)
Chris = Student
Tina = Student

# get uni of student Chris
print('Uni:', Chris.uni)

# student changes subject
Chris.subject = 'biology'

print("Chris' subject:", Chris.subject)
print('class name:', Chris.__name__)

print("Tina's subject:", Tina.subject)
Uni: Passau
Chris' subject: biology
class name: Student
Tina's subject: biology

Classes can not only store values but also functions, called methods. Methods are defined just like regular functions, but inside the class body. To make a class more useful (compared to the example above), the __init__() method is needed. As arguments, it takes self and all other arguments needed as input to build the object. self is used for instance variables, i.e. variables that belong to an object and not the whole class.

Methods and class variables are chained by a . to the object.

class Student:
    def __init__(self, uni, subject, grades):
        self.uni = uni
        self.subject = subject
        self.grades = grades
    
    #define method to show average grade
    def avg_grade(self):
        return np.mean(self.grades)
     
Chris = Student('Passau', 'Art', [1.3, 1.7, 3.0])
Tina = Student('Regensburg', 'Physics', [1.0, 2.0])

Chris.subject = 'engineering'
print(f'Chris: {Chris.subject}')
print(f'Tina: {Tina.subject}')

print(Chris.avg_grade())

# lists have a built-in method append, which adds items given as arguments to the end of the list (see help(list))
Chris.grades.append(5.0)
print(Chris.avg_grade())
Chris: engineering
Tina: Physics
2.0
2.75

Inheritance

Classes can inherit all properties from another class by naming the parent class in the class definition and calling the super() function in the constructor. This should be applied if a new class is meant to expand the functionality of the original class without changing the original class. For example, if a Student is also a resident:

class LocalResident(Student):
    def __init__(self, uni, subject, grades, address):
        super().__init__(uni, subject, grades)
        self.address = address
        
Chris = LocalResident('Passau', 'Art', [1.3, 1.7, 3.0, 5.0], 'Innstr. 27')

print(f" Uni and address: {Chris.uni}, {Chris.address}")
print(type(Chris))
 Uni and address: Passau, Innstr. 27
<class '__main__.LocalResident'>

Files

Regular python files end with .py and the code can be written and read in any text editor. The suffix signals that the file should be treated as a python script.

Jupyter Notebook files end with .ipynb. The special cell-wise structure leads to more formatting effort which is stored in a .json format (see later chapters), meaning that when a .ipynb file is opened in an ordinary text editor, the structure will be very different from what is shown in this jupyter interface.

Below a jupyter notebook is shown on the left. The formatting is seen on the right, where the .ipynb file was opened using a text editor.

Second steps

In the following, some common operations are shown in order to get used to the python language and its functionality.

NumPy

One of the most useful and widely used libraries is NumPy. It makes working with arrays, and thus vectors and matrices, very efficient and includes a broad variety of mathematical tools. Because of its powerful implementations, it serves as the basis for many other packages.

Even though this course will hardly rely on numpy itself, a short introduction is given in the following.

We will start with importing as np. Information will mainly be given by comments in the code.

import numpy as np

# mathematical functions:
print('sin(pi) = ',np.sin(np.pi)) 
# note that the sine of pi is exactly zero mathematically, yet python returns a small number > 0 due to floating-point precision
sin(pi) =  1.2246467991473532e-16

One of the most useful concepts is that of arrays, which can have an arbitrary number of dimensions (though three will do here). Arrays correspond to what you may know as scalars, vectors or matrices.

arr_a = np.array([1,2,3,4,5])
arr_b = np.array([0,0,0,0,1])

#dimensions of an array
print('0. dimension of array a', arr_a.shape)

# operations can be performed by methods and methods chaining on these array objects
print('1. sum of array a:',arr_a.sum())
print('2. variance of array a:', arr_a.var())

# broadcasting a scalar to an array
print('3. one plus array b:', 1 + arr_b)
print('4. element-wise multiplication of arrays:', arr_a * arr_b)

# vector multiplication / dot product
print('5. scalar by dot product:', arr_a.transpose().dot(arr_b))   # transpose() has no effect on a 1-D array; .dot() gives the inner product (a scalar)

# with explicit (5x1) and (1x5) matrices, the inner dimension of the product matters and the result is a (5x5) matrix
print('5.1 vector multiplication: \n', np.matrix(arr_a).transpose().dot(np.matrix(arr_b)))

# shapes differ for arrays and matrices
print('5.2 transposed array:', arr_a.transpose().shape, 'transposed matrix:', np.matrix(arr_a).transpose().shape )

# to reshape an array, use the reshape method
zero_arr = np.zeros(shape=(2,3))
print('6. array:     reshaped:           same result:\n', 
      zero_arr, zero_arr.reshape(-1,), zero_arr.reshape(6,))  
# -1 is for "unknown" dim (like a wildcard)
0. dimension of array a (5,)
1. sum of array a: 15
2. variance of array a: 2.0
3. one plus array b: [1 1 1 1 2]
4. element-wise multiplication of arrays: [0 0 0 0 5]
5. scalar by dot product: 5
5.1 vector multiplication: 
 [[0 0 0 0 1]
 [0 0 0 0 2]
 [0 0 0 0 3]
 [0 0 0 0 4]
 [0 0 0 0 5]]
5.2 transposed array: (5,) transposed matrix: (5, 1)
6. array:     reshaped:           same result:
 [[0. 0. 0.]
 [0. 0. 0.]] [0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0.]
# random number generation with np.random module

# for a list of probability distributions, the autocomplete function can be used (press tab after the dot)
random_expo = np.random.exponential(scale=.5, size=(5,5))
print(random_expo)
[[1.47470581e-01 5.31335731e-02 4.96493019e-03 5.87555323e-01
  3.42546506e+00]
 [5.27133278e-02 3.67601997e-01 9.62705621e-01 4.74340273e-01
  1.23891774e-01]
 [6.57889755e-01 8.75486230e-01 6.04786796e-01 9.43935440e-01
  1.29252533e+00]
 [3.44523767e-03 2.64355819e-01 3.48651056e-02 4.16541704e-01
  1.65012325e-03]
 [2.76619159e-01 8.92062502e-01 7.67249908e-02 5.27413305e-01
  3.86930040e-01]]

Working with Data

Several libraries offer useful tools to work with data in order to allow for a meaningful analysis. One of the most popular and powerful is Pandas. Besides efficient ways of cleaning and manipulating data, pandas also includes functions for statistical analysis and graphics.

Usually, pandas is imported under the alias pd.

import pandas as pd

Pandas - DataFrames and Series

Indexing

The basic elements for data are DataFrames and Series. A DataFrame is a whole matrix- or table-like representation of data with column and row names. A Series can be understood as a single column of such a data matrix (but without the need for a table).
There are respective functions to turn other objects, e.g. lists or dicts, into DataFrames or Series. Indexing, similar to lists or dicts, uses square brackets.

my_list = [1,2,3,4,5,6,7]
my_df = pd.DataFrame(my_list, columns=['var1'])
print('df:\n', my_df)

my_series = pd.Series(my_list)
print('series:\n',my_series)

# selecting a single column from a DataFrame
print('select column from df:\n', my_df['var1'])
df:
    var1
0     1
1     2
2     3
3     4
4     5
5     6
6     7
series:
 0    1
1    2
2    3
3    4
4    5
5    6
6    7
dtype: int64
select column from df:
 0    1
1    2
2    3
3    4
4    5
5    6
6    7
Name: var1, dtype: int64

To select specific rows or columns, the iloc method (selection by integer position) and the loc method (selection by label) are recommended, especially when several columns are to be selected. Indexing can also be done with boolean Series (or lists) and thus conditionally.
Another way to select a single column is by chaining the column’s name to the DataFrame’s name by a dot (like in method chaining).

my_df = pd.DataFrame(
{'age': [20, 34, 56],
 'height': [183, 179, 172]
}, index=['person_a', 'person_b', 'person_c'])
print(my_df)
print('1.:', my_df.loc['person_b','age'], 'is the same as',  my_df.iloc[1,0])

# age > 27
print('indexing by condition/list\n', my_df.loc[my_df.age >27], '\ncorresponds to \n', my_df.loc[[False, True, True]])
print(type(my_df.age >27))
          age  height
person_a   20     183
person_b   34     179
person_c   56     172
1.: 34 is the same as 34
indexing by condition/list
           age  height
person_b   34     179
person_c   56     172 
corresponds to 
           age  height
person_b   34     179
person_c   56     172
<class 'pandas.core.series.Series'>
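
The dot notation mentioned above and selection of several columns with loc look like this, continuing the DataFrame from above:

print(my_df.age)                        # same as my_df['age']
print(my_df.loc[:, ['age', 'height']])  # all rows, both columns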

Useful Methods

Pandas includes many useful methods that will help you get to know and manipulate a dataset. Some of these methods are shown in the following, others are introduced later when needed.
More often than not, a dataset will contain missing values, i.e. cells in a data table contain no value. They will be depicted as NaN, Not a Number.

import numpy as np
my_df =  pd.DataFrame(
{'age': [20, 34, 56, np.nan, 44],
 'height': [183, 179, np.nan,  163, np.nan]
})
my_df
    age  height
0  20.0   183.0
1  34.0   179.0
2  56.0     NaN
3   NaN   163.0
4  44.0     NaN
# view the first rows (view last rows with .tail())
print('0.\n', my_df.head(n=5))


# general information
print('\n1.')
my_df.info()

# descriptive statistics on dataset
print('\n2.\n',my_df.describe())

# number of missing values per column
print('\n3.\n',my_df.isnull().sum())

# single statistics are included as methods, also for single columns
print('\n4.\n', my_df.age.mean())

# fill missing values (e.g. with mean of column)
print('\n 5.\n', my_df.fillna(my_df.mean()))    

# note that you must assign this to my_df (or a different variable) in order to impute missing values permanently!
my_df = my_df.fillna(my_df.mean())

# sort values by column(s)
print('\n6.\n', my_df.sort_values(by=['height']))    
0.
     age  height
0  20.0   183.0
1  34.0   179.0
2  56.0     NaN
3   NaN   163.0
4  44.0     NaN

1.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   age     4 non-null      float64
 1   height  3 non-null      float64
dtypes: float64(2)
memory usage: 208.0 bytes

2.
              age      height
count   4.000000    3.000000
mean   38.500000  175.000000
std    15.264338   10.583005
min    20.000000  163.000000
25%    30.500000  171.000000
50%    39.000000  179.000000
75%    47.000000  181.000000
max    56.000000  183.000000

3.
 age       1
height    2
dtype: int64

4.
 38.5

 5.
     age  height
0  20.0   183.0
1  34.0   179.0
2  56.0   175.0
3  38.5   163.0
4  44.0   175.0

6.
     age  height
3  38.5   163.0
2  56.0   175.0
4  44.0   175.0
1  34.0   179.0
0  20.0   183.0
# get column names (useful for looping)
print('7.\n', my_df.columns)

# drop rows containing missing values
print('8.\n', my_df.dropna()) 

# drop rows or columns
print('9.\n', my_df.drop(['age'], axis=1))

# merge DataFrames (automatically on shared variable if not specified otherwise)
df2 = pd.DataFrame(
{'age': [20, 34, 56, np.nan, 44],
 'weight': [83, 63, 98,  50, 77]
})
print('10.\n', my_df.merge(df2))
my_df = my_df.merge(df2)

# correlation matrix
print('11.\n', my_df.corr())

# adding new columns
my_df = my_df.assign(bmi = my_df.weight/(my_df.height/100)**2)
my_df
7.
 Index(['age', 'height'], dtype='object')
8.
     age  height
0  20.0   183.0
1  34.0   179.0
2  56.0   175.0
3  38.5   163.0
4  44.0   175.0
9.
    height
0   183.0
1   179.0
2   175.0
3   163.0
4   175.0
10.
     age  height  weight
0  20.0   183.0      83
1  34.0   179.0      63
2  56.0   175.0      98
3  44.0   175.0      77
11.
              age    height    weight
age     1.000000 -0.946549  0.481176
height -0.946549  1.000000 -0.282126
weight  0.481176 -0.282126  1.000000
    age  height  weight        bmi
0  20.0   183.0      83  24.784258
1  34.0   179.0      63  19.662308
2  56.0   175.0      98  32.000000
3  44.0   175.0      77  25.142857

As a last tool in this section, we will look at the get_dummies() function. Dummy variables are used to encode categorical variables with zero and one, for example in order to calculate the correlation with some other numerical variable.

df3 = pd.DataFrame(
{'hair': ['blonde', 'black', 'red', 'red', 'black']
})

print(pd.get_dummies(df3.hair))
   black  blonde  red
0      0       1    0
1      1       0    0
2      0       0    1
3      0       0    1
4      1       0    0

Plots

Methods for standard plot types are available. For a histogram of the data, just use .hist(). Other plot types are available by chaining .plot. and the plot type.

# histogram
my_df.hist()
array([[<AxesSubplot:title={'center':'age'}>,
        <AxesSubplot:title={'center':'height'}>],
       [<AxesSubplot:title={'center':'weight'}>,
        <AxesSubplot:title={'center':'bmi'}>]], dtype=object)
[figure: histograms of age, height, weight and bmi]
# lineplot
my_df.sort_values(by='age').plot.line(x='age', y='height')
<AxesSubplot:xlabel='age'>
[figure: line plot of height against age]
# scatter plot
my_df.plot.scatter(x='age', y='weight')
<AxesSubplot:xlabel='age', ylabel='weight'>
[figure: scatter plot of weight against age]

Importing and Exporting Data

Your data may come to you in various file formats. Pandas enables you to import data from all common formats. The respective functions are usually called read_ and to_ followed by the respective file type.

To read a .csv, for example, use the read_csv() function. Note that the file does not have to be stored locally on your computer; a URL can be passed as well.

# import from csv
import pandas as pd
dax = pd.read_csv('data/DAX.csv')
print(dax.head(3))
print(dax.tail(3))
         Date          Open          High           Low         Close  \
0  2020-03-10  10724.980469  11032.290039  10423.900391  10475.490234   
1  2020-03-11  10601.849609  10761.429688  10390.509766  10438.679688   
2  2020-03-12   9863.990234   9932.559570   9139.120117   9161.129883   

      Adj Close     Volume  
0  10475.490234  267400800  
1  10438.679688  216708900  
2   9161.129883  390477000  
           Date          Open          High           Low         Close  \
251  2021-03-08  14024.570313  14402.919922  13977.129883  14380.910156   
252  2021-03-09  14345.509766  14475.650391  14309.349609  14437.940430   
253  2021-03-10  14439.450195  14554.490234  14408.519531  14528.570313   

        Adj Close     Volume  
251  14380.910156  109071900  
252  14437.940430  107881800  
253  14528.570313          0  
# save data frame to excel
dax.to_excel('data/DAX.xlsx')
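
Reading the file back in works analogously with read_excel() (a small sketch, assuming an Excel engine such as openpyxl is installed; index_col=0 re-uses the index column written above):

dax_from_excel = pd.read_excel('data/DAX.xlsx', index_col=0)
print(dax_from_excel.head(2))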

Let’s do some exploration and manipulation of the historical data from the DAX index we just imported.

print('shape:', dax.shape)
shape: (254, 7)
dax.info()    # the 'Date' column is of dtype object 
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 254 entries, 0 to 253
Data columns (total 7 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Date       254 non-null    object 
 1   Open       254 non-null    float64
 2   High       254 non-null    float64
 3   Low        254 non-null    float64
 4   Close      254 non-null    float64
 5   Adj Close  254 non-null    float64
 6   Volume     254 non-null    int64  
dtypes: float64(5), int64(1), object(1)
memory usage: 14.0+ KB
# check type of first entry in 'Date'
print(type(dax.Date[0]))
<class 'str'>

Transform it to datetime, a special type for dates in python.

dax['Datetime'] = pd.to_datetime(dax.Date)
print(dax.Datetime.head(3))    # check dtype now
0   2020-03-10
1   2020-03-11
2   2020-03-12
Name: Datetime, dtype: datetime64[ns]
print(dax.columns)
Index(['Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume',
       'Datetime'],
      dtype='object')
print(f'of {len(dax)} rows:\n{dax.notna().sum()}')
print('')
print(f'makes a total of {dax.isnull().sum().sum()} missing values')
of 254 rows:
Date         254
Open         254
High         254
Low          254
Close        254
Adj Close    254
Volume       254
Datetime     254
dtype: int64

makes a total of 0 missing values
dax.plot(x='Datetime', y=['Open', 'Close'])    # using Datetime for plotting
<AxesSubplot:xlabel='Datetime'>
[figure: line plot of the Open and Close prices over time]
dax.describe()
               Open          High           Low         Close     Adj Close        Volume
count    254.000000    254.000000    254.000000    254.000000    254.000000  2.540000e+02
mean   12488.785944  12595.082573  12373.931187  12489.142640  12489.142640  9.469272e+07
std     1324.154858   1295.824891   1365.892066   1331.191733   1331.191733  4.941752e+07
min     8495.940430   8668.480469   8255.650391   8441.709961   8441.709961  0.000000e+00
25%    11896.257813  12185.372315  11850.512451  12031.852783  12031.852783  6.409985e+07
50%    12847.964844  12945.129883  12764.479981  12851.169922  12851.169922  7.875350e+07
75%    13314.162353  13362.944824  13224.792481  13292.120361  13292.120361  1.085247e+08
max    14439.450195  14554.490234  14408.519531  14528.570313  14528.570313  3.904770e+08

For statistics on one variable, index the result as usual.

mean_open = dax.describe().loc['mean', 'Open']
print(mean_open)
12488.785944444882

Create a new column with a flag indicating whether the closing price was higher than the opening price.

dax = dax.assign(positive = dax.Close > dax.Open)
print(dax.head(3))

print('')
# fraction of days when this was the case
print('fraction of positive days:', dax.positive.mean())
print('\ncheck: \n', dax.positive.value_counts())
         Date          Open          High           Low         Close  \
0  2020-03-10  10724.980469  11032.290039  10423.900391  10475.490234   
1  2020-03-11  10601.849609  10761.429688  10390.509766  10438.679688   
2  2020-03-12   9863.990234   9932.559570   9139.120117   9161.129883   

      Adj Close     Volume   Datetime  positive  
0  10475.490234  267400800 2020-03-10     False  
1  10438.679688  216708900 2020-03-11     False  
2   9161.129883  390477000 2020-03-12     False  

fraction of positive days: 0.5

check: 
 False    127
True     127
Name: positive, dtype: int64

Extract the same fraction for every day of the week. Days are counted from 0 (Monday) to 6 (Sunday).

for i in range(7):
    print(f'day {i}: ', dax[dax.Datetime.dt.dayofweek == i].positive.mean())
day 0:  0.58
day 1:  0.49056603773584906
day 2:  0.5283018867924528
day 3:  0.44
day 4:  0.4583333333333333
day 5:  nan
day 6:  nan

A more straightforward way uses built-in methods.

dax = dax.assign(wday = dax.Datetime.dt.dayofweek)
dax.groupby(['wday']).mean()  # rows with nans are not calculated
              Open          High           Low         Close     Adj Close        Volume  positive
wday
0     12510.387852  12640.970996  12400.179629  12546.658457  12546.658457  8.860451e+07  0.580000
1     12523.045843  12621.270969  12403.764906  12520.457547  12520.457547  9.450496e+07  0.490566
2     12522.144310  12628.387014  12414.166403  12527.621886  12527.621886  9.062774e+07  0.528302
3     12422.626602  12512.666523  12286.600566  12394.369766  12394.369766  9.770017e+07  0.440000
4     12460.538106  12567.442179  12360.190674  12450.887695  12450.887695  1.025976e+08  0.458333

SciKit Learn

The sklearn package provides a broad collection of data analysis and machine learning tools. These tools basically cover the whole process from data manipulation over fitting different models to the data to evaluating the results. Sklearn is based on numpy and may thus require that the data to analyse is provided as a numpy array. Results will also be returned as such arrays.

The respective modules, i.e. classes and functions for different algorithms, are provided in an API which allows easy application, basically without requiring knowledge about how the algorithm itself works. It goes without saying that you should nevertheless have a basic understanding of the particular strengths of an algorithm and any caveats when fitting a model, in order to extract meaningful results.

In this course, we will present only a small portion of sklearn’s functionality and refer to the API’s documentation.

Sklearn provides a kind of framework syntax for fitting models of different kinds:

  • instantiate the respective (algorithm’s) object. Here, the hyper parameters and options are set

  • fit the data using this object

  • evaluate the model or use the model for prediction

We will now look at some basic procedures for analysing data using different methods.

To load our modules, we import them specifically from the scikit-learn package (the name used for installation) instead of importing the whole package.

Data manipulation

When working with real data, the scales of your variables may differ dramatically. However, some algorithms are numerically more stable when the feature values lie in a similar range. Some types of regression, for example, even require rescaling in order to work properly.

Sklearn provides several scaler classes to map all values to a desired range. Note however, that such rescaling may affect the interpretation of the results.

Let’s have a look at the StandardScaler, which is part of the preprocessing module. The rescaling is done by subtracting the mean and dividing by the standard deviation; it works best for features that are roughly normally distributed.

from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt

In the example below, we create an array filled with ten million numbers drawn from a normal distribution with a mean of two and a standard deviation of 5.

Then, we instantiate the StandardScaler object.

Subsequently, we use its .fit() method, passing the simulated data as input. Note here the use of the array’s .reshape() method, as StandardScaler objects require a 2D array.

At last, we create the rescaled data using .transform().

# synthetic data
x = np.random.normal(2,5,10000000)
print(f"x: mean = {x.mean()}, std = {x.std()}")

scaler = StandardScaler()
scaler.fit(x.reshape(-1,1))
x_rescaled = scaler.transform(x.reshape(-1,1))

print(f"x rescaled: mean = {x_rescaled.mean()}, std = {x_rescaled.std()}")

plt.subplot(1,2,1)
plt.hist(x)
plt.subplot(1,2,2)
plt.hist(x_rescaled)
plt.show()
x: mean = 1.9989688394057843, std = 5.000619417270245
x rescaled: mean = -4.386890850582859e-16, std = 0.9999999999999997
[figure: histograms of x and the rescaled x]

We see the results printed above, with a mean of zero and a standard deviation of 1 (with a slight numerical deviation). From the histograms, we can also see that the transformed distribution on the right hand side is centered around zero, i.e. the mean is indeed shifted to zero.

It is important to disambiguate the two occurrences of x.reshape(-1,1) as argument to the methods:

  • fit(): the scaler ‘learns’ the parameters from the data, meaning it calculates the mean and std for this data specifically

  • transform(): the scaler applies the parameters calculated in the step before to center and rescale the data in its argument. The parameters are fixed for the scaler object, as long as .fit() is not called again on different data.

This means, that we can apply the same fitted parameters to another dataset using .transform() on this new data, as shown below.

x_alt = np.random.normal(2,5,10000000)
x_alt_scaled = scaler.transform(x_alt.reshape(-1,1))

print(f"x_alt: mean = {x_alt_scaled.mean()}, std = {x_alt_scaled.std()}")
x_alt: mean = 0.0002722015390079619, std = 0.9996064685986829

At last, let’s check what the fitted parameters are and do the ‘transform’ calculation ourselves to compare the result.

Fitted parameters in sklearn are saved as attributes to the fitting object, in this case scaler. We access them by scaler.mean_ and scaler.scale_. The mean of our calculation here reproduces the mean from the cell using the .transform() method.

calc_result = x_alt - scaler.mean_
calc_result /= scaler.scale_

print(calc_result.mean())

assert calc_result.mean() == x_alt_scaled.mean()
0.0002722015390079619

When we want to fit the scaler and transform the data in one step, we can use the shortcut method .fit_transform(), which combines the two methods used before into one. The results are the same as for using first .fit() and then .transform().

x_ft = scaler.fit_transform(x.reshape(-1,1))
print(f"x_alt: mean = {x_ft.mean()}, std = {x_ft.std()}")
x_alt: mean = -4.386890850582859e-16, std = 0.9999999999999997

There are more scaler types available in sklearn which can be applied using the same syntax as shown here.
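
For example, the MinMaxScaler maps each feature to a given range (by default [0, 1]). A minimal sketch, reusing the array x from above:

from sklearn.preprocessing import MinMaxScaler

minmax = MinMaxScaler()                              # default feature_range=(0, 1)
x_minmax = minmax.fit_transform(x.reshape(-1, 1))    # fit and transform in one step
print(f"x min-max scaled: min = {x_minmax.min()}, max = {x_minmax.max()}")    # 0.0 and 1.0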

Regression

Linear Regression

A popular and widely used tool in data analysis is linear regression, where we model the influence of some independent numerical variables on the value of a dependent numerical variable (the target).

We find the ordinary least squares regression readily available in sklearn’s module linear_model as the LinearRegression class. An important note here is that by default, sklearn will fit an intercept. To exclude the intercept, set fit_intercept=False when instantiating the LinearRegression object.

We will use a health insurance data set with several variables, which we will use to try to explain the charges taken for the policy. We do have some categorical variables, for which we would need to create dummy variables to use in the regression directly.

import pandas as pd
df = pd.read_csv("data/insurance.csv")
#df = df.select_dtypes(include=['int64', 'float64'])
df.head(2)
   age     sex    bmi  children smoker     region     charges
0   19  female  27.90         0    yes  southwest  16884.9240
1   18    male  33.77         1     no  southeast   1725.5523

To perform a simple linear regression, we proceed similar to above:

  • transform the pandas Series for independent and dependent variable to a numpy array, which we reshape to 2D

  • instantiate the object (default option fit_intercept=True included for illustrative reasons)

  • call the .fit() method with positional arguments: first independent variable X, then y

At this point, the regression is done and we can access the parameters for the intercept and the variable’s coefficient. Again, the syntax is similar as above for the StandardScaler.

In this example, we regress the charges for a policy on the body mass index of the customer.

from sklearn.linear_model import LinearRegression

X = np.array(df.bmi).reshape(-1,1)
y = np.array(df.charges).reshape(-1,1)

linreg = LinearRegression(fit_intercept=True)
linreg.fit(X,y)

print(f"intercept: {linreg.intercept_}, coefficient: {linreg.coef_}")
intercept: [1192.93720896], coefficient: [[393.8730308]]

What we find is a positive slope, linreg.coef_ > 0, meaning that a higher bmi results in higher charges. To be precise, if your bmi increases by one unit, you will, on average, be charged 394 units more.

We can also see that the estimated parameters are returned as (nested) arrays. To get to the values, we must hence extract them accordingly.

print(f"intercept: {linreg.intercept_[0]}, coefficient: {linreg.coef_[0][0]}")
intercept: 1192.9372089611497, coefficient: 393.87303079739524

Now, we can generate the regression line and plot it together with the data to get a visualisation of the regression results.

To do so, we simply pass the X values to the .predict() method and save the result as y_pred. We then use matplotlib to create a scatter plot of the data and a line plot of the regression line.

y_pred = linreg.predict(X)

plt.scatter(X, y)
plt.plot(X,y_pred, color='red')
plt.show()
[figure: scatter plot of charges against bmi with the fitted regression line]

In the case above, we used the data “seen” for prediction. We can easily extrapolate this to unseen data, by simply predicting for a larger value range.

x_plot = np.linspace(0,80).reshape(-1,1)
y_extrapol = linreg.predict(x_plot)

plt.scatter(X, y)
plt.plot(x_plot,y_extrapol, color='red')
plt.show()
[figure: the same scatter plot with the regression line extrapolated over a larger bmi range]

While we can clearly identify the trend of higher charges for customers with a higher bmi, we can also see what appears to be three different groups. We will try to go into a little more detail about this later.

To evaluate how well our model explains the data, we can calculate different measures. A quite common one is the coefficient of determination:

$$R^2 = 1 - \frac{\sum_i(y_i - \hat{y_i})^2}{\sum_i(y_i - \bar{y})^2}$$

With the predicted value $\hat{y_i}$ and the mean $\bar{y}$.

Thanks to sklearn, we do not have to calculate it from the results ourselves but find it in sklearn.metrics. To the r2_score() function, we pass as arguments the true y data and the prediction.

from sklearn.metrics import r2_score

r2_slr = r2_score(y, y_pred)
print(r2_slr)
0.03933913991786264
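
To convince ourselves that r2_score() implements the formula above, we can compute the quantity by hand with numpy and compare, a small sketch:

ss_res = np.sum((y - y_pred)**2)     # residual sum of squares
ss_tot = np.sum((y - y.mean())**2)   # total sum of squares
print(1 - ss_res/ss_tot)             # matches r2_slr up to floating-point precision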

The result of about 4% is hardly satisfying, and neither is the fit in the plot we saw above. From what we see in the graphic, the data may consist of different groups, one at the bottom and one stretching out on top. We will check whether we can identify these groups by variables in the data, using seaborn’s hue option.

As a first guess, we can look at ‘smoker’, since smokers can be expected to be charged higher.

import seaborn as sns

sns.scatterplot(x=df.bmi, y=df.charges, hue=df.smoker)
plt.show()
[figure: scatter plot of charges against bmi, coloured by smoker status]

In fact, we see a rather clear distinction between smokers and non-smokers, indicating that you will generally be charged more when you smoke. It also appears that for smokers, the increase in charges with increasing bmi is steeper than for non-smokers.

We now split the data accordingly and run a regression for each group to see if we can verify these assumptions.

X_smoker = np.array(df[df.smoker =='yes']['bmi']).reshape(-1,1)
y_smoker = np.array(df[df.smoker =='yes']['charges']).reshape(-1,1)

X_non_smoker = np.array(df[df.smoker =='no']['bmi']).reshape(-1,1)
y_non_smoker = np.array(df[df.smoker =='no']['charges']).reshape(-1,1)

linreg_smoker = LinearRegression(fit_intercept=True)
linreg_smoker.fit(X_smoker,y_smoker)

linreg_non_smoker = LinearRegression(fit_intercept=True)
linreg_non_smoker.fit(X_non_smoker,y_non_smoker)

print(f"smoker: \nintercept: {linreg_smoker.intercept_}, coefficient: {linreg_smoker.coef_}")
print(f"non-smoker: \nintercept: {linreg_non_smoker.intercept_}, coefficient: {linreg_non_smoker.coef_}")
smoker: 
intercept: [-13186.57632276], coefficient: [[1473.1062547]]
non-smoker: 
intercept: [5879.42408187], coefficient: [[83.35055766]]

We see that the coefficient for the smokers is indeed much higher than for non-smokers. Regarding the intercept, we need to be aware that it gives the value at bmi=0, which should not be considered relevant in reality.

Let’s also have a look at the plots.

y_pred_sm = linreg_smoker.predict(X_smoker)
y_pred_non = linreg_non_smoker.predict(X_non_smoker)

plt.scatter(X_smoker, y_smoker,)
plt.plot(X_smoker,y_pred_sm, color='red')
plt.scatter(X_non_smoker, y_non_smoker, color='green')
plt.plot(X_non_smoker,y_pred_non, color='orange')
plt.show()
[figure: scatter plots and fitted regression lines for smokers and non-smokers]

Finally, the $R^2$

r2_slr_sm = r2_score(y_smoker, y_pred_sm)
r2_slr_non = r2_score(y_non_smoker, y_pred_non)
print(f"smokers: {r2_slr_sm}, \nnon-smokers: {r2_slr_non}")
smokers: 0.6504109694921547, 
non-smokers: 0.007062140580960441

Digression: statsmodels

At this point, we take a brief look at an important aspect of regression: significance. We skipped this above because sklearn has its shortcomings when it comes to this kind of ‘classical statistics’. The calculation of p-values, for example, is not included and must be implemented by oneself.

For such purposes, we can switch to the statsmodels package which may be a better choice to do all types of regression modelling as it comes with a broad output of test statistics and performance metrics.

Below, an example is given for the regression from above with the just slightly different statsmodels syntax. We see that using the .summary() method automatically outputs a table of information about the model. We can now, for example, look at the significance of the coefficients.

import statsmodels.api as sm

X_smoker = sm.add_constant(X_smoker)
X_non_smoker = sm.add_constant(X_non_smoker)

stats_smoker = sm.OLS(y_smoker, X_smoker)
res_smoker = stats_smoker.fit()
print(res_smoker.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.650
Model:                            OLS   Adj. R-squared:                  0.649
Method:                 Least Squares   F-statistic:                     506.1
Date:                Thu, 14 Oct 2021   Prob (F-statistic):           5.02e-64
Time:                        14:44:41   Log-Likelihood:                -2807.2
No. Observations:                 274   AIC:                             5618.
Df Residuals:                     272   BIC:                             5626.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const      -1.319e+04   2052.885     -6.423      0.000   -1.72e+04   -9145.013
x1          1473.1063     65.484     22.496      0.000    1344.187    1602.026
==============================================================================
Omnibus:                       24.771   Durbin-Watson:                   1.878
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               38.639
Skew:                           0.568   Prob(JB):                     4.07e-09
Kurtosis:                       4.447   Cond. No.                         156.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

We can also access the quantities from the table individually, using the attribute names listed when running dir(res_smoker).

stats_non_smoker = sm.OLS(y_non_smoker, X_non_smoker)
res_non_smoker = stats_non_smoker.fit()
print(np.round(res_non_smoker.pvalues,3))
[0.    0.006]
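
As a minimal illustration (the attribute names below are part of the statsmodels results API), the fitted results objects also expose the individual quantities directly:

print(res_smoker.params)        # intercept and slope
print(res_smoker.rsquared)      # R^2, matching the summary table
print(res_smoker.conf_int())    # 95% confidence intervals of the coefficients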

Supervised learning: Random Forests

The broad field of machine learning can be subdivided into supervised learning, unsupervised learning and reinforcement learning. For the first two, sklearn offers a variety of algorithms as plug and play functions. We will look at one example of supervised learning while introducing some basic concepts.

We will train a classifier, i.e. a model that predicts a class for each observation. As we saw above, smokers and non-smokers are separated quite well in the data. We would thus expect a model to learn to distinguish between those two classes.

We will use a Random Forest model which basically consists of a collection of randomized, decorrelated decision trees. It is a powerful algorithm for classification and can be used out of the box in sklearn.

In the following, we

  • import the RandomForestClassifier for which we can use the usual sklearn .fit() and .predict() syntax

  • import the train_test_split() function to create two separate sub-datasets from the original. This is a standard procedure in machine learning to check whether the model can generalize its predictions to new data. It randomly divides our X and y data into two parts each:

    • predictor and target variables (of the same size) for training, i.e. fitting the model

    • predictor and target variables (of the same size) for testing, i.e. evaluating the performance. These observations are not used in the training/fitting process!

    We set the ratio test_size to 0.4, so that 40% of the observations are reserved for testing and have thus not been seen by the algorithm before.

  • import roc_auc_score, a metric which can be used to determine the performance of a classifier

After creating the training and test datasets (using unpacking), we instantiate the random forest object and set max_depth to four. This is a hyperparameter, i.e. a model parameter which is not optimized by the algorithm but set by us.

After the usual fit and predict calls, we print the ROC-AUC score using the test targets and the prediction for the test data.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X_rf = df[['age', 'bmi', 'children', 'charges']]
y_rf = df.smoker == 'yes'

X_train, X_test, y_train, y_test = train_test_split(X_rf, y_rf, test_size=0.4, random_state= 1)

rf = RandomForestClassifier(max_depth=4, random_state=44)
rf.fit(X_train,y_train)

y_pred = rf.predict(X_test)
print(roc_auc_score(y_test,y_pred))
0.9755105572862581

A ROC-AUC score of about 0.98 shows that the model can indeed distinguish between the two classes very well, using the given predictor variables.
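
To complement the ROC-AUC value, one could additionally inspect where the remaining errors occur; a minimal sketch using sklearn's confusion_matrix (not part of the original example):

from sklearn.metrics import confusion_matrix

# rows: true class (non-smoker, smoker), columns: predicted class
print(confusion_matrix(y_test, y_pred))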

While single decision trees are well interpretable by plotting them, random forests lose this advantage due to the large number of trees used. We can, however, access the individual trees via the .estimators_ attribute after fitting the model. It is a list whose length equals the number of trees, by default (as used here) 100:

print(len(rf.estimators_))
100

Now, as a last demonstration of sklearn's user-friendliness, we use plot_tree to plot the first tree built by the random forest. It creates a tree diagram showing information about the splits made at each node. Note that our choice of max_depth is reflected in the four levels of child nodes below the root.

from sklearn.tree import plot_tree

fig, axes = plt.subplots(1, 1, figsize=(10,6), dpi=120)
plot_tree(rf.estimators_[0],
          feature_names = X_rf.columns,
          class_names=['smoker', 'non_smoker'],
          filled = True,
          label='root')
plt.show() 
_images/Python_introduction_196_0.png

TensorFlow: Keras

TensorFlow is a machine learning and artificial intelligence platform. While it includes broad functionality, we will focus here on artificial neural networks (ANN). To build such networks, we will leverage the high-level API Keras and show how to use ‘the sequential model’. This allows us to create the simplest form of neural network, a fully connected feed-forward structure. This unwieldy term basically refers to a stack of so-called layers of neurons, where each neuron of one layer is ‘connected’ to every neuron of the subsequent layer (as we won’t go into theoretical details: original paper, Wikipedia, The Elements of Statistical Learning, chapter 11).

TensorFlow works with numpy arrays and pandas dataframes. We will now try to do a regression on some simulated data, using a neural network.

The package tensorflow must be installed. To build neural networks, we start by importing the necessary modules. Note that Keras is shipped as a submodule of TensorFlow, which we access via dot chaining (e.g. keras.layers.Dense below), so we only need to import keras itself (plus random for reproducibility).

A simple sequential model

from tensorflow import keras, random
import numpy as np
import matplotlib.pyplot as plt

random.set_seed(102)

We write a function to simulate some data using numpy. We include a noise argument so we can also plot the true underlying function, which is otherwise obscured by the added noise.

x = np.sort(np.random.uniform(-5,5,400))

def make_data(a, b, c, _x=x, noise=True):
    # cubic polynomial in _x, optionally with additive Gaussian noise
    signal = a*_x + b*(_x - 4)**2 + c*_x**3
    if noise:
        return signal + np.random.normal(0, 40, len(_x))
    return signal

y = make_data(-.5,2,1.5)
plt.scatter(x,y)
plt.plot(x,make_data(-.5,2,1.5, noise=False), c='red', lw=3, alpha=.5)
plt.show()
_images/Python_introduction_202_0.png

We define a very simple fully connected ANN with one hidden layer and only two neurons inside that layer. Since we aim to perform regression, we use one neuron in the output layer. To do so in the code, we call keras.Sequential and pass all layers as a list. Elements of that list are keras.layers.Dense objects, which represent fully connected layers. Note that the order in which you define this list corresponds to the stacking of the layers. As arguments of Dense, we specify the number of neurons and the desired activation function. We may also give names to the layers, which becomes more useful for more complex net architectures.

Be aware that we explicitly define an input shape here. Alternatively, we could let Keras infer it automatically from the dimension of our input data. Specifying the input shape, however, enables us to view the model summary without having to run anything else beforehand (a small sketch of the alternative follows after the summary).

model = keras.Sequential([
        keras.layers.Dense(2, activation="relu", input_shape=(1,), name="hidden_layer"),
        keras.layers.Dense(1, name="output_layer"),
], name='my_first_model')

A summary of the defined model can be printed by calling the method of the same name.

model.summary()
Model: "my_first_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
hidden_layer (Dense)         (None, 2)                 4         
_________________________________________________________________
output_layer (Dense)         (None, 1)                 3         
=================================================================
Total params: 7
Trainable params: 7
Non-trainable params: 0
_________________________________________________________________
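
For comparison, a minimal sketch of the alternative mentioned above (the name lazy_model is only illustrative): the same architecture without an input shape, built explicitly via .build() before calling .summary().

lazy_model = keras.Sequential([
        keras.layers.Dense(2, activation="relu", name="hidden_layer"),
        keras.layers.Dense(1, name="output_layer"),
], name='lazy_model')

# without an input shape the weights are not created yet, so we build the
# model manually; None stands for the (arbitrary) batch dimension
lazy_model.build(input_shape=(None, 1))
lazy_model.summary()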

Having defined the model, we must compile it, i.e. set the final configuration. We call the .compile() method on our model and specify the optimization algorithm (including its learning rate) and the loss function by which the model should be trained. For regression, we use the mean squared error.

The learning rate is a hyperparameter and has to be chosen to suit the problem at hand. It is possible to find a good rate by random or grid search, which we will not discuss further here.

model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=5e-2), loss=keras.losses.MeanSquaredError(), )

We can now train the model, i.e. adjust the weights so as to minimize the error calculated from the deviations between model predictions and training data.

We do so by calling the .fit() method, to which we supply the data and the number of epochs (passes over the training data during which the weights are tuned). Furthermore, we can set the batch_size, which defines the number of samples over which each weight update is accumulated.

We can save all information returned from the fitting process in a History object, here called history. Through this object, we can access this information once the fitting is done.

history = model.fit(x,y, epochs=150, batch_size=30)
Epoch 1/150
14/14 [==============================] - 0s 639us/step - loss: 4742.2642
Epoch 2/150
14/14 [==============================] - 0s 557us/step - loss: 4200.7036
Epoch 3/150
14/14 [==============================] - 0s 585us/step - loss: 3634.8594
...
Epoch 148/150
14/14 [==============================] - 0s 515us/step - loss: 1751.0267
Epoch 149/150
14/14 [==============================] - 0s 506us/step - loss: 1741.6172
Epoch 150/150
14/14 [==============================] - 0s 490us/step - loss: 1735.1869

We get a printed output for each epoch (abbreviated above), showing some information about the training progress. Below each ‘Epoch’ line, we see that a batch size of 30 with 400 observations results in 14 (400 divided by 30, rounded up) weight updates per epoch.

Of more interest is the rightmost piece of information, the loss. As the weights are adjusted, the model predictions fit the data progressively better - the loss decreases. At some point, the model in its current configuration reaches an optimal weight setting and the loss will oscillate around some value.

Note that by setting verbose=0, we can suppress any output.
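
As a sketch of that argument (note that running it would continue training the model fitted above rather than start from scratch):

# same fit call, but without any printed progress output
silent_history = model.fit(x, y, epochs=150, batch_size=30, verbose=0)
print(silent_history.history['loss'][-1])   # final loss of this additional run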

Using history, we can now plot the loss over the epochs. To do so, we access the object's .history attribute, which is a dict:

print(type(history.history))
print(history.history.keys())
<class 'dict'>
dict_keys(['loss'])

Note that more entries may be added to this dict; we will see this for the next model.

With matplotlib, we plot the loss over time.

plt.plot(history.history['loss'])
[<matplotlib.lines.Line2D at 0x7ff6d35f2a90>]
_images/Python_introduction_214_1.png

Apparently, the loss dropped very quickly in the beginning and reached a roughly constant level after about 75 epochs.

Besides the history object, we can take a look at the trained weights, either for the whole model via model.get_weights() or layer by layer via layer.get_weights():

for layer in model.layers:
    print(f"{layer.name}: weights {layer.get_weights()}")
hidden_layer: weights [array([[ 4.204577  , -0.37333143]], dtype=float32), array([-12.313049,   5.3113  ], dtype=float32)]
output_layer: weights [array([[18.764547],
       [ 2.000779]], dtype=float32), array([25.884768], dtype=float32)]

For each layer, the number of entries corresponds to ‘Param #’ in the model summary above. The first array per layer contains the weights, the second the biases.

Note here that it is perfectly possible to save the weights of a model, e.g. in a file, and later restore the model by assigning these weights again; of course, the model configuration must remain the same for the weights to fit. The built-in model.save('my_path') method does exactly that, storing configuration and weights together. To load such a saved model, use keras.models.load_model('my_path').
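
A minimal sketch of both options ('my_path' is a placeholder):

# option 1: save the full model (architecture + weights) and restore it
model.save('my_path')
restored_model = keras.models.load_model('my_path')

# option 2: keep only the weights and assign them to a model with the
# identical architecture
weights = model.get_weights()
model.set_weights(weights)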

To see the predictions from our model, we use the .predict() method, to which we only pass the x values, and store the result in a variable.

y_pred = model.predict(x)

Finally, we can get a look at how this simple neural network approximates our data:

plt.scatter(x, y)
plt.plot(x,y_pred.reshape(-1), color='orange', lw=4)
plt.show()
_images/Python_introduction_220_0.png

Interestingly, the ReLU shape of the activation function can be seen rather clearly in the prediction.

Now, let’s try to do better than that.

A more complex model

We will now test whether a more flexible, more complex model with more layers and neurons approximates the data more accurately.

First, we define this new model as seen above.

random.set_seed(11)

better_model = keras.Sequential([
        keras.layers.Dense(22, activation="relu", name="l1"),
        keras.layers.Dense(11, activation="relu", name="l2"),
        keras.layers.Dense(1, name="lo"),
])

We will try a smaller learning rate (and may thus increase the number of epochs).

better_model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=6e-3), loss=keras.losses.MeanSquaredError())

Usually, when training a model, we split the data into a training and test set. We will do this using sklearn’s train_test_split. Conveniently, in Keras, we can pass the validation data using the validation_data argument of .fit().

Another possibility is to use validation_split which, when set for example to 0.2, holds out 20% of the training data and evaluates the performance on this “unknown” part. Beware that this only works well for shuffled data, as the validation split is taken from the last entries of the data (see the sketch right below).
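
A minimal sketch of that alternative, run on a throwaway model (alt_model and perm are illustrative names) so that better_model itself stays untouched; because our x is sorted, we shuffle first:

perm = np.random.permutation(len(x))

alt_model = keras.Sequential([
        keras.layers.Dense(22, activation="relu"),
        keras.layers.Dense(11, activation="relu"),
        keras.layers.Dense(1),
])
alt_model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=6e-3),
                  loss=keras.losses.MeanSquaredError())

# the last 20% of the shuffled data are held out for validation automatically
alt_history = alt_model.fit(x[perm], y[perm], epochs=20, validation_split=0.2, verbose=0)
print(alt_history.history.keys())   # now contains 'loss' and 'val_loss'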

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.15, random_state= 22)
better_history = better_model.fit(x_train,y_train, epochs=200, validation_data=(x_test,y_test))
Epoch 1/200
11/11 [==============================] - 0s 10ms/step - loss: 4646.1631 - val_loss: 5256.3979
Epoch 2/200
11/11 [==============================] - 0s 2ms/step - loss: 4273.9917 - val_loss: 4757.6499
Epoch 3/200
11/11 [==============================] - 0s 2ms/step - loss: 3813.5903 - val_loss: 4152.2627
...
Epoch 129/200
11/11 [==============================] - 0s 2ms/step - loss: 1473.1925 - val_loss: 1602.6216
Epoch 130/200
11/11 [==============================] - 0s 2ms/step - loss: 1470.3301 - val_loss: 1573.1053
Epoch 131/200

 1/11 [=>............................] - ETA: 0s - loss: 900.6684

11/11 [==============================] - 0s 2ms/step - loss: 1474.7050 - val_loss: 1588.8188
Epoch 132/200

 1/11 [=>............................] - ETA: 0s - loss: 1480.3346

11/11 [==============================] - 0s 2ms/step - loss: 1469.0326 - val_loss: 1605.8528
Epoch 133/200

 1/11 [=>............................] - ETA: 0s - loss: 1917.2261

11/11 [==============================] - 0s 2ms/step - loss: 1454.7498 - val_loss: 1663.6276
Epoch 134/200

 1/11 [=>............................] - ETA: 0s - loss: 1711.2400

11/11 [==============================] - 0s 2ms/step - loss: 1472.8729 - val_loss: 1612.5951
Epoch 135/200

 1/11 [=>............................] - ETA: 0s - loss: 1468.8561

11/11 [==============================] - 0s 2ms/step - loss: 1468.4617 - val_loss: 1622.9133
Epoch 136/200

 1/11 [=>............................] - ETA: 0s - loss: 1245.1802

11/11 [==============================] - 0s 2ms/step - loss: 1475.6405 - val_loss: 1605.5909
Epoch 137/200

 1/11 [=>............................] - ETA: 0s - loss: 1609.5730

11/11 [==============================] - 0s 2ms/step - loss: 1458.8844 - val_loss: 1632.2683
Epoch 138/200

 1/11 [=>............................] - ETA: 0s - loss: 1686.6156

11/11 [==============================] - 0s 2ms/step - loss: 1461.0183 - val_loss: 1607.8722
Epoch 139/200

 1/11 [=>............................] - ETA: 0s - loss: 1740.3622

11/11 [==============================] - 0s 2ms/step - loss: 1464.7855 - val_loss: 1609.1526
Epoch 140/200

 1/11 [=>............................] - ETA: 0s - loss: 1840.6277

11/11 [==============================] - 0s 2ms/step - loss: 1466.7628 - val_loss: 1635.9297
Epoch 141/200

 1/11 [=>............................] - ETA: 0s - loss: 1935.0656

11/11 [==============================] - 0s 2ms/step - loss: 1468.1019 - val_loss: 1648.8243
Epoch 142/200

 1/11 [=>............................] - ETA: 0s - loss: 1198.0137

11/11 [==============================] - 0s 2ms/step - loss: 1457.9678 - val_loss: 1638.1875
Epoch 143/200

 1/11 [=>............................] - ETA: 0s - loss: 967.0182

11/11 [==============================] - 0s 2ms/step - loss: 1458.3104 - val_loss: 1647.0142
Epoch 144/200

 1/11 [=>............................] - ETA: 0s - loss: 1041.4209

11/11 [==============================] - 0s 2ms/step - loss: 1450.0575 - val_loss: 1612.1339
Epoch 145/200

 1/11 [=>............................] - ETA: 0s - loss: 1043.7056

11/11 [==============================] - 0s 2ms/step - loss: 1453.6185 - val_loss: 1632.3977
Epoch 146/200

 1/11 [=>............................] - ETA: 0s - loss: 1432.8466

11/11 [==============================] - 0s 2ms/step - loss: 1439.7356 - val_loss: 1627.0262
Epoch 147/200

 1/11 [=>............................] - ETA: 0s - loss: 1419.2288

11/11 [==============================] - 0s 2ms/step - loss: 1459.5042 - val_loss: 1643.0553
Epoch 148/200

 1/11 [=>............................] - ETA: 0s - loss: 1507.3357

11/11 [==============================] - 0s 2ms/step - loss: 1438.8228 - val_loss: 1708.0776
Epoch 149/200

 1/11 [=>............................] - ETA: 0s - loss: 1760.6677

11/11 [==============================] - 0s 2ms/step - loss: 1456.9601 - val_loss: 1662.2997
Epoch 150/200

 1/11 [=>............................] - ETA: 0s - loss: 1234.8677

11/11 [==============================] - 0s 2ms/step - loss: 1450.9950 - val_loss: 1637.8534
Epoch 151/200

 1/11 [=>............................] - ETA: 0s - loss: 1568.4287

11/11 [==============================] - 0s 2ms/step - loss: 1444.0192 - val_loss: 1648.7045
Epoch 152/200

 1/11 [=>............................] - ETA: 0s - loss: 1792.5715

11/11 [==============================] - 0s 2ms/step - loss: 1440.2759 - val_loss: 1663.6783
Epoch 153/200

 1/11 [=>............................] - ETA: 0s - loss: 1357.1970

11/11 [==============================] - 0s 2ms/step - loss: 1440.9316 - val_loss: 1673.6965
Epoch 154/200

 1/11 [=>............................] - ETA: 0s - loss: 1608.9795

11/11 [==============================] - 0s 2ms/step - loss: 1443.1624 - val_loss: 1644.4531
Epoch 155/200

 1/11 [=>............................] - ETA: 0s - loss: 1915.9012

11/11 [==============================] - 0s 2ms/step - loss: 1444.9686 - val_loss: 1643.9163
Epoch 156/200

 1/11 [=>............................] - ETA: 0s - loss: 1003.0920

11/11 [==============================] - 0s 2ms/step - loss: 1440.8553 - val_loss: 1638.5570
Epoch 157/200

 1/11 [=>............................] - ETA: 0s - loss: 1197.7260

11/11 [==============================] - 0s 2ms/step - loss: 1439.6317 - val_loss: 1667.5896
Epoch 158/200

 1/11 [=>............................] - ETA: 0s - loss: 1844.1990

11/11 [==============================] - 0s 2ms/step - loss: 1451.2251 - val_loss: 1666.1378
Epoch 159/200

 1/11 [=>............................] - ETA: 0s - loss: 1536.6626

11/11 [==============================] - 0s 2ms/step - loss: 1440.8701 - val_loss: 1657.0903
Epoch 160/200

 1/11 [=>............................] - ETA: 0s - loss: 1471.1005

11/11 [==============================] - 0s 2ms/step - loss: 1432.4332 - val_loss: 1663.3536
Epoch 161/200

 1/11 [=>............................] - ETA: 0s - loss: 1815.4309

11/11 [==============================] - 0s 2ms/step - loss: 1433.9142 - val_loss: 1666.0747
Epoch 162/200

 1/11 [=>............................] - ETA: 0s - loss: 1036.1960

11/11 [==============================] - 0s 2ms/step - loss: 1439.2948 - val_loss: 1683.9341
Epoch 163/200

 1/11 [=>............................] - ETA: 0s - loss: 1705.3892

11/11 [==============================] - 0s 2ms/step - loss: 1437.0526 - val_loss: 1652.8529
Epoch 164/200

 1/11 [=>............................] - ETA: 0s - loss: 1982.4498

11/11 [==============================] - 0s 2ms/step - loss: 1435.1355 - val_loss: 1665.3280
Epoch 165/200
 1/11 [=>............................] - ETA: 0s - loss: 1303.4935

11/11 [==============================] - 0s 2ms/step - loss: 1430.4878 - val_loss: 1679.3038
Epoch 166/200
 1/11 [=>............................] - ETA: 0s - loss: 1341.6897

11/11 [==============================] - 0s 2ms/step - loss: 1423.7057 - val_loss: 1738.3024
Epoch 167/200
 1/11 [=>............................] - ETA: 0s - loss: 1147.9730

11/11 [==============================] - 0s 2ms/step - loss: 1425.4886 - val_loss: 1671.9388
Epoch 168/200

 1/11 [=>............................] - ETA: 0s - loss: 1594.6624

11/11 [==============================] - 0s 2ms/step - loss: 1437.7329 - val_loss: 1663.7887
Epoch 169/200
 1/11 [=>............................] - ETA: 0s - loss: 1588.7053

11/11 [==============================] - 0s 2ms/step - loss: 1424.3811 - val_loss: 1749.5554
Epoch 170/200
 1/11 [=>............................] - ETA: 0s - loss: 1631.3878

11/11 [==============================] - 0s 2ms/step - loss: 1446.0881 - val_loss: 1672.2445
Epoch 171/200
 1/11 [=>............................] - ETA: 0s - loss: 1538.4336

11/11 [==============================] - 0s 2ms/step - loss: 1420.5936 - val_loss: 1671.8226
Epoch 172/200

 1/11 [=>............................] - ETA: 0s - loss: 1001.7332

11/11 [==============================] - 0s 2ms/step - loss: 1439.8226 - val_loss: 1669.8950
Epoch 173/200

 1/11 [=>............................] - ETA: 0s - loss: 1410.9131

11/11 [==============================] - 0s 2ms/step - loss: 1425.3586 - val_loss: 1672.7628
Epoch 174/200

 1/11 [=>............................] - ETA: 0s - loss: 1053.9016

11/11 [==============================] - 0s 2ms/step - loss: 1432.4529 - val_loss: 1678.6622
Epoch 175/200

 1/11 [=>............................] - ETA: 0s - loss: 1246.0857

11/11 [==============================] - 0s 2ms/step - loss: 1434.8838 - val_loss: 1668.8575
Epoch 176/200

 1/11 [=>............................] - ETA: 0s - loss: 1549.1364

11/11 [==============================] - 0s 2ms/step - loss: 1437.4459 - val_loss: 1688.0320
Epoch 177/200

 1/11 [=>............................] - ETA: 0s - loss: 1785.4211

11/11 [==============================] - 0s 2ms/step - loss: 1416.4330 - val_loss: 1706.6183
Epoch 178/200

 1/11 [=>............................] - ETA: 0s - loss: 863.6588

11/11 [==============================] - 0s 2ms/step - loss: 1416.2174 - val_loss: 1750.5310
Epoch 179/200
 1/11 [=>............................] - ETA: 0s - loss: 1554.0161

11/11 [==============================] - 0s 2ms/step - loss: 1436.9507 - val_loss: 1685.5170
Epoch 180/200
 1/11 [=>............................] - ETA: 0s - loss: 887.4516

11/11 [==============================] - 0s 2ms/step - loss: 1431.2572 - val_loss: 1684.2542
Epoch 181/200
 1/11 [=>............................] - ETA: 0s - loss: 1202.6226

11/11 [==============================] - 0s 2ms/step - loss: 1432.5452 - val_loss: 1688.7600
Epoch 182/200
 1/11 [=>............................] - ETA: 0s - loss: 849.3640

11/11 [==============================] - 0s 2ms/step - loss: 1416.5259 - val_loss: 1668.8086
Epoch 183/200
 1/11 [=>............................] - ETA: 0s - loss: 1258.4830

11/11 [==============================] - 0s 2ms/step - loss: 1434.9156 - val_loss: 1714.0836
Epoch 184/200
 1/11 [=>............................] - ETA: 0s - loss: 1903.2600

11/11 [==============================] - 0s 2ms/step - loss: 1428.5526 - val_loss: 1676.4127
Epoch 185/200

 1/11 [=>............................] - ETA: 0s - loss: 866.7876

11/11 [==============================] - 0s 2ms/step - loss: 1421.0732 - val_loss: 1656.5540
Epoch 186/200

 1/11 [=>............................] - ETA: 0s - loss: 1380.8625

11/11 [==============================] - 0s 2ms/step - loss: 1416.2097 - val_loss: 1730.7185
Epoch 187/200

 1/11 [=>............................] - ETA: 0s - loss: 1707.0049

11/11 [==============================] - 0s 2ms/step - loss: 1428.5590 - val_loss: 1666.9265
Epoch 188/200

 1/11 [=>............................] - ETA: 0s - loss: 1721.2739

11/11 [==============================] - 0s 2ms/step - loss: 1413.6920 - val_loss: 1721.7948
Epoch 189/200
 1/11 [=>............................] - ETA: 0s - loss: 1682.0562

11/11 [==============================] - 0s 2ms/step - loss: 1413.0726 - val_loss: 1679.6664
Epoch 190/200
 1/11 [=>............................] - ETA: 0s - loss: 1463.1976

11/11 [==============================] - 0s 2ms/step - loss: 1421.7323 - val_loss: 1687.9373
Epoch 191/200
 1/11 [=>............................] - ETA: 0s - loss: 1806.2244

11/11 [==============================] - 0s 2ms/step - loss: 1414.5066 - val_loss: 1681.8976
Epoch 192/200
 1/11 [=>............................] - ETA: 0s - loss: 1258.0668

11/11 [==============================] - 0s 2ms/step - loss: 1406.0670 - val_loss: 1720.4008
Epoch 193/200
 1/11 [=>............................] - ETA: 0s - loss: 1404.4884

11/11 [==============================] - 0s 2ms/step - loss: 1428.9807 - val_loss: 1702.1516
Epoch 194/200
 1/11 [=>............................] - ETA: 0s - loss: 1447.3972

11/11 [==============================] - 0s 2ms/step - loss: 1419.3719 - val_loss: 1687.3341
Epoch 195/200
 1/11 [=>............................] - ETA: 0s - loss: 1632.1079

11/11 [==============================] - 0s 2ms/step - loss: 1416.9988 - val_loss: 1727.9908
Epoch 196/200

 1/11 [=>............................] - ETA: 0s - loss: 1090.8004

11/11 [==============================] - 0s 2ms/step - loss: 1417.9749 - val_loss: 1701.5841
Epoch 197/200

 1/11 [=>............................] - ETA: 0s - loss: 935.6511

11/11 [==============================] - 0s 2ms/step - loss: 1413.2587 - val_loss: 1683.8630
Epoch 198/200
 1/11 [=>............................] - ETA: 0s - loss: 1825.1587

11/11 [==============================] - 0s 2ms/step - loss: 1418.5487 - val_loss: 1668.6381
Epoch 199/200
 1/11 [=>............................] - ETA: 0s - loss: 1331.9351

11/11 [==============================] - 0s 2ms/step - loss: 1420.4141 - val_loss: 1711.3992
Epoch 200/200
 1/11 [=>............................] - ETA: 0s - loss: 1481.9601

11/11 [==============================] - 0s 2ms/step - loss: 1408.7479 - val_loss: 1684.5994

In the output we can now see additional information: the loss obtained when applying the weights of each epoch to the validation data (val_loss). Let’s have a look at it.

plt.plot(better_history.history['loss'], label='training')
plt.plot(better_history.history['val_loss'], label='validation')
plt.legend()
plt.title('loss')
plt.show()
_images/Python_introduction_231_0.png

We see a moderate yet steady decrease in both curves before they settle on a plateau with slight oscillation. The plot shows that we do not have to deal with overfitting and that our model seems to have converged. There are ways to handle overfitting, should it occur, but we will not discuss them in detail here beyond the short sketch below.
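
For instance, Keras offers an EarlyStopping callback that stops training once a monitored metric stops improving. The following is only a minimal sketch: the variable names x_train and y_train and the validation_split value are illustrative placeholders, not the exact setup used above.

from tensorflow import keras

# Stop training once the validation loss has not improved for 10 epochs
# and roll back to the best weights seen so far.
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss',
                                           patience=10,
                                           restore_best_weights=True)

# Hypothetical usage with placeholder names (x_train, y_train, validation_split):
# better_history = better_model.fit(x_train, y_train,
#                                   validation_split=0.2,
#                                   epochs=200,
#                                   callbacks=[early_stop])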

Finally, let’s have a look at the predictions of the trained model again, this time using all the data:

y_pred = better_model.predict(x)                                  # predict on all data points
plt.scatter(x, y)                                                 # noisy data
plt.plot(x, make_data(-.5, 2, 1.5, noise=False), c='red', lw=4)   # underlying function without noise
plt.plot(x, y_pred.reshape(-1), c='orange', lw=4)                 # model prediction
plt.show()
_images/Python_introduction_234_0.png

At last, we see that our more complex model is well able to recover the underlying function that we used to generate our data.
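
As an optional, rough check, one could also quantify how close the prediction is to the noise-free curve, reusing the make_data() call from the plot above. This assumes make_data(-.5, 2, 1.5, noise=False) returns one value per entry of x, as the plot suggests.

import numpy as np

# Mean squared error between the model prediction and the noise-free curve.
y_true = np.asarray(make_data(-.5, 2, 1.5, noise=False))
mse_to_true = np.mean((y_pred.reshape(-1) - y_true) ** 2)
print('MSE between prediction and underlying function:', mse_to_true)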

While we have only covered a tiny bit of TensorFlow and Keras, the syntax stays the same for different kinds of networks. For example, to build a convolutional neural network, we would use keras.layers.Conv2D() instead of keras.layers.Dense(). For more types of layers and how to build more complex architectures, visit the documentation referred to above.
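
As a pure illustration of this point, a small convolutional model could, for example, be defined with the Sequential API as sketched below. The input shape (28×28 single-channel images) and all layer sizes are arbitrary choices, not taken from this chapter’s example.

from tensorflow import keras

# Minimal sketch of a convolutional network; input shape and layer sizes are arbitrary.
conv_model = keras.Sequential([
    keras.layers.Conv2D(16, kernel_size=3, activation='relu',
                        input_shape=(28, 28, 1)),  # convolutional layer instead of Dense
    keras.layers.MaxPooling2D(pool_size=2),        # downsample the feature maps
    keras.layers.Flatten(),                        # flatten before the final Dense layer
    keras.layers.Dense(1)                          # single output, e.g. for regression
])
conv_model.summary()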