python notes

The following (incomplete) resources were used to develop the snippets and notes below. Other links are available inline with the text, and I am working on using org-ref for a proper bibliography system (where possible).

Platform independent shell commands like open

Note here the use of platform.system() to identify the platform. The subprocess.call command launches the given command as a separate process (not a thread) and waits for it to finish.

import platform
import subprocess

# Once the folder is created, I want to download pictures of cats.
# For this I need an available folder location and a URL to download from.
def display_cats(folder):
    # open folder
    print("Displaying cats in OS window.")
    if platform.system() == "Darwin":
        subprocess.call(["open", folder])
    elif platform.system() == "Windows":
        subprocess.call(["explorer", folder])
    elif platform.system() == "Linux":
        subprocess.call(["xdg-open", folder])
    else:
        print("We don't support your os: " + platform.system())

Python example functions to download a file via URL

import os
import shutil

import requests


def get_cat(
    folder,
    name,
    url="http://consuming-python-services-api.azurewebsites.net/cats/random",
):
    """Download data from a URL and save it to a file.

    In this case, download cat pictures.
    """
    data = get_data_from_url(url)
    save_image(folder, name, data)


def get_data_from_url(url):
    # Note the stream argument. This is required so that response.raw
    # exposes the actual file contents as a file-like object.
    response = requests.get(url, stream=True)
    print("Status code is {}".format(response.status_code))
    return response.raw


def save_image(folder, name, data):
    # data is a file-like object, so it must not be part of the path.
    file_name = os.path.join(folder, name + ".jpg")
    with open(file_name, "wb") as f:
        shutil.copyfileobj(data, f)

Python truthiness

Empty sequences and collections (lists, tuples, strings, dictionaries, sets), zero ints, zero floats, None and False are all deemed to be falsy in Python. Python has no null or nil pointers; None plays that role.
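
A minimal sketch of the rule above, using bool() to show how each value is interpreted in a condition. The variable names are only for illustration.

```python
# Each of these values is falsy: bool(value) is False.
falsy_values = [[], (), {}, set(), "", 0, 0.0, None, False]

for value in falsy_values:
    print(repr(value), "->", bool(value))

# Non-empty containers and non-zero numbers are truthy, even when
# their contents look "empty" (a list holding 0, the string "False").
truthy_values = [[0], " ", "False", -1, 0.1]

for value in truthy_values:
    print(repr(value), "->", bool(value))
```

This is why `if my_list:` is the usual way to test for a non-empty list.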

Conditional expression has lower precedence than other operators

Example: here the plus operators bind more tightly than the conditional expression, so the line is grouped as ('a' + 'x') if '123'.isdigit() else ('y' + 'b') and evaluates to 'ax'.

'a' + 'x' if '123'.isdigit() else 'y' + 'b'
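
A short sketch of the grouping, with explicit parentheses added for comparison. The variable names are illustrative only.

```python
# Without parentheses, each + is grouped before the conditional.
result = 'a' + 'x' if '123'.isdigit() else 'y' + 'b'
grouped = ('a' + 'x') if '123'.isdigit() else ('y' + 'b')
print(result)       # ax
print(grouped)      # ax

# To apply the conditional to the middle term only, parenthesise it.
middle_only = 'a' + ('x' if '123'.isdigit() else 'y') + 'b'
print(middle_only)  # axb
```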

Running modules with python -m

How to Run Your Python Scripts – Real Python

The -m option searches sys.path for the module name and runs its content as __main__. Note: the argument is an importable module name (for example http.server), not a file path, so the .py extension is omitted.
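
As a rough illustration, the same thing can be driven from Python itself. This sketch assumes Python 3.7+ (for capture_output) and uses the standard library platform module, which prints a platform description when run as a script.

```python
import subprocess
import sys

# Equivalent to typing `python -m platform` in a shell, using the same
# interpreter that is running this script.
completed = subprocess.run(
    [sys.executable, "-m", "platform"],
    capture_output=True,
    text=True,
    check=True,
)
platform_string = completed.stdout.strip()
print(platform_string)
```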

Function can be defined and then assigned later

def func(x):
    return x + 1

f = func
print(f(2) + func(2))
6

String join is different from append

Note here how the result is papaya!: the a is inserted between adjacent elements of the list rather than added at the end, i.e. join is different from append.

The list to join is always the sole input to join(), which is called on the string you want to join with.

string = 'ppy!'
fruit = 'a'.join(list(string))
print(fruit)
papaya!

String to list

If a string is converted to a list without using split, then each character of the string becomes an item in the list. This makes it simple to work on the characters of a string as required, and each character can then be appended to a list to reconstruct the sentence.

Note then that the whitespaces between words are also captured and this is why the sentence can be reconstructed meaningfully.

text = "The Zen of Python, by Tim Peters"
print(list(text))

# Using a split command on a string will result in a list with the items
# decided based on the split character used.
print(text.split())
['T', 'h', 'e', ' ', 'Z', 'e', 'n', ' ', 'o', 'f', ' ', 'P', 'y', 't', 'h', 'o', 'n', ',', ' ', 'b', 'y', ' ', 'T', 'i', 'm', ' ', 'P', 'e', 't', 'e', 'r', 's']
['The', 'Zen', 'of', 'Python,', 'by', 'Tim', 'Peters']

Objects of type float should not be compared with the equality operator

Instead specify a tolerance and check the difference is within the tolerance

tolerance = 0.001
print(1.1 + 2.2 == 3.3)
comparison = abs((1.1 + 2.2) - 3.3) < tolerance
print(comparison)
False
True
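
The standard library also offers math.isclose() for this purpose. The sketch below shows it next to the manual tolerance check; the abs_tol value is an arbitrary choice for illustration.

```python
import math

total = 1.1 + 2.2
print(total == 3.3)              # False
print(math.isclose(total, 3.3))  # True

# The default tolerance is relative, so it scales with the inputs.
# When comparing against zero, an absolute tolerance is needed.
print(math.isclose(1e-10, 0.0))                # False
print(math.isclose(1e-10, 0.0, abs_tol=1e-9))  # True
```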

Daily coding problem book

This section contains the solutions outlined in the daily coding problem book with my comments and notes and tasks for further exploration.

The idea is to practice good elements of writing code, starting with docstrings and comments, and assessing the order of time and space complexities.

Smallest window

Find the smallest window (range of indices) within an array that must be sorted for the whole array to become sorted.

Note: A smart Loop initialization and choosing the right kind of variable is the key.

# method 1 use the sorted method available in python. This method is O(n log n)
def smallest_window(input_array):
    left, right = None, None
    s = sorted(input_array)
    # Now we know what the sorted array looks like. However, this
    # costs O(n) extra space.

    for i in range(len(input_array)):
        # On the first mismatch, left is still None, so the index is
        # assigned to left. Every later mismatch overwrites right, so
        # right ends up at the last mismatched index. Here left and
        # right are not related to ascending or descending order; they
        # simply mark the bounds of the window that has to be sorted.
        if input_array[i] != s[i] and left is None:
            left = i
        elif input_array[i] != s[i]:
            right = i
    return left, right

# method 2 by traversing the array
def smallest_window_traverse(input_array):
    left, right = None, None
    n = len(input_array)
    max_seen, min_seen = -float("inf"), float("inf")

    for i in range(n):
        max_seen = max(max_seen, input_array[i])
        # max_seen is the running maximum of the array. It starts at
        # -inf, so the first element always becomes the maximum. An
        # element smaller than the running maximum is out of place, so
        # the window must extend at least up to it.
        if input_array[i] < max_seen:
            right = i

    for i in range(n - 1, -1, -1):
        min_seen = min(min_seen, input_array[i])
        # Traversing from the right, an element larger than the running
        # minimum is out of place, so the window must start at or
        # before it.
        if input_array[i] > min_seen:
            left = i
            # How is this the smallest window? The same array is
            # traversed from both directions, and a bound is moved only
            # when an element violates the running max or min
            # criterion, i.e. only when it has to be moved at all.
    return left, right
a = [ 33,1, 444, 555,12,11, 666]
sorted_a = sorted(a)
print(sorted_a)
smallest_window(a)
b = -float("inf")
print(b)

[1, 11, 12, 33, 444, 555, 666]
-inf
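
One way to sanity-check a window is to sort only the slice inside it and confirm that the whole array ends up sorted. The sketch below uses a compact variant of the sort-based method; window_by_sorting is a hypothetical helper name, not part of the book's solution.

```python
def window_by_sorting(values):
    """Return (left, right) bounds of the smallest window, or (None, None)."""
    mismatches = [
        i for i, (x, y) in enumerate(zip(values, sorted(values))) if x != y
    ]
    if not mismatches:
        return None, None
    return mismatches[0], mismatches[-1]


a = [33, 1, 444, 555, 12, 11, 666]
left, right = window_by_sorting(a)
print(left, right)  # 0 5

# Sorting only the window should sort the entire array.
patched = a[:left] + sorted(a[left:right + 1]) + a[right + 1:]
print(patched == sorted(a))  # True
```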

Maximum Subarray sum

Given an array calculate the maximum sum of any contiguous subarray

# brute force of considering every subarray combination
def brute_force_subarray_sum(array):
    current_max = 0
    # The outer loop covers every start index i. The inner loop runs up
    # to len(array) + 1 so that the slice array[i:j] can include the
    # last element. When j == i the slice is empty and its sum is 0,
    # which is why a 0 is printed at the start of each outer iteration.
    # As such this makes no difference to the results.
    for i in range(len(array)):
        for j in range(i, len(array)+1):
            current_max = max(current_max, sum(array[i:j]))
            print(sum(array[i:j]))
    return current_max


# Implement kadane's algorithm
def kadane_subarray_sum(array):
    maxSoFar, maxHere = 0, 0
    for item in array:
        maxHere = max(item, maxHere + item)
        print(f"Max at point is {maxHere}")
        maxSoFar = max(maxSoFar, maxHere)
        print(f"Max so far is {maxSoFar}")
    return maxSoFar

a = [1, 4 ,5, 6, 7]
print(f"Kadane max subarray is {kadane_subarray_sum(a)}")
print(f"Brute force subarray is {brute_force_subarray_sum(a)}")

Max at point is 1
Max so far is 1
Max at point is 5
Max so far is 5
Max at point is 10
Max so far is 10
Max at point is 16
Max so far is 16
Max at point is 23
Max so far is 23
Kadane max subarray is 23
0
1
5
10
16
23
0
4
9
15
22
0
5
11
18
0
6
13
0
7
Brute force subarray is 23
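
Both functions above start their running maximum at 0, so for an array of only negative numbers they return 0, which treats the empty subarray as a valid answer. If a non-empty subarray is required, one possible variant seeds the running values with the first element instead. This is a sketch under that assumption, and kadane_nonempty is a hypothetical name.

```python
def kadane_nonempty(array):
    """Maximum sum over non-empty contiguous subarrays.

    Assumes the array has at least one element.
    """
    best = current = array[0]
    for item in array[1:]:
        # Either extend the current subarray or start a new one at item.
        current = max(item, current + item)
        best = max(best, current)
    return best


print(kadane_nonempty([1, 4, 5, 6, 7]))             # 23
print(kadane_nonempty([34, -50, 42, 14, -5, 86]))   # 137
print(kadane_nonempty([-3, -1, -2]))                # -1
```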

Array product manipulation

def products(nums):
    """Given an input of numbers in an array - generate a new array with the
    products of all the items in the input, excluding the item at the
    corresponding index. Division is not to be used.

    The methodology : for each item in the given array, another 2 arrays
    of corresponding prefix and suffix products are created. Then all
    that has to be done is multiply corresponding elements of the prefix
    and suffix for each element of the input provided.

    """
    # Generate prefix products.  Here prefix_products[-1] * num refers
    # to the previous entry in the list, which is already a product of
    # the /corresponding/ prefixes.  Also note here the way in which the
    # empty list is created and the conditional takes advantage of this
    # fact. This allows the very first element to stay in the loop.
    prefix_products = []
    for num in nums:
        if prefix_products:
            prefix_products.append(prefix_products[-1] * num)
        else:
            prefix_products.append(num)

    # Generate suffix products. One key here is in using the
    # reversed(). This enables the function between prefix and suffix to
    # stay largely the same. 
    suffix_products = []
    for num in reversed(nums):
        if suffix_products:
            suffix_products.append(suffix_products[-1] * num)
        else:
            suffix_products.append(num)
    # Note how the list is reversed again to bring it back in order.
    suffix_products = list(reversed(suffix_products))

    # Generate the results from product of prefix and suffix
    results = []
    for i in range(len(nums)):
        # at the starting of the list, there is no prefix. Therefore
        # only a product of the suffix is required, and from the next
        # index.
        if i == 0:
            results.append(suffix_products[i + 1])
        # at the end of the list, there is no suffix, and hence only the
        # prefix products are required.
        elif i == len(nums) - 1:
            results.append(prefix_products[i - 1])
        else:
            results.append(prefix_products[i - 1] * suffix_products[i + 1])
    return results
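
The same prefix and suffix idea can be written more compactly with itertools.accumulate. This is only a sketch: it shifts both lists by one position (with a leading 1 for the empty product) so that the first, last, and single-element cases need no special handling. The name products_accumulate is hypothetical.

```python
from itertools import accumulate
from operator import mul


def products_accumulate(nums):
    if not nums:
        return []
    # prefix[i] is the product of nums[:i]; the leading 1 stands for
    # the empty product before the first element.
    prefix = [1] + list(accumulate(nums, mul))[:-1]
    # suffix[i] is the product of nums[i + 1:], built from the
    # reversed input and then reversed back into order.
    suffix = ([1] + list(accumulate(reversed(nums), mul))[:-1])[::-1]
    return [p * s for p, s in zip(prefix, suffix)]


print(products_accumulate([1, 2, 3, 4, 5]))  # [120, 60, 40, 30, 24]
print(products_accumulate([3, 2, 1]))        # [2, 3, 6]
```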

Using Division

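def products_elements_division(array):
    # Note: this approach raises ZeroDivisionError if the array
    # contains a zero, and true division returns floats.
    product = 1
    for item in array:
        product = product * item

    productOtherElements = []
    for item in array:
        # Dividing the total product by an item leaves the product of
        # all the other items.
        productOtherElements.append(product / item)
    return productOtherElements
a = [1,2,3,4]
products_elements_division(a)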

Palindrome function

Note: to check whether any permutation of a string can form a palindrome, every character must appear an even number of times, with at most one character (the middle one) allowed to appear an odd number of times.

def is_palindrome(word):
    """Convert a word to lower case and check whether it is a palindrome."""
    word = word.lower()
    # Return a real boolean: the strings "true" and "false" are both truthy.
    return word[::-1] == word

word = "Deleveled"
is_palindrome(word)
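
The note at the start of this section, about permutations of a string, can be sketched with collections.Counter. The function name can_form_palindrome and the sample words are assumptions for illustration.

```python
from collections import Counter


def can_form_palindrome(word):
    """Check whether some permutation of word is a palindrome."""
    counts = Counter(word.lower())
    # Count how many distinct characters occur an odd number of times.
    odd_counts = sum(1 for count in counts.values() if count % 2 == 1)
    return odd_counts <= 1


print(can_form_palindrome("carrace"))  # True, e.g. "racecar"
print(can_form_palindrome("daily"))    # False
```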

While, break and continue

This example is from a free PyBites exercise.

VALID_COLORS = ['blue', 'yellow', 'red']


def print_colors():
    """In the while loop ask the user to enter a color,
       lowercase it and store it in a variable. Next check:
       - if 'quit' was entered for color, print 'bye' and break.
       - if the color is not in VALID_COLORS, print 'Not a valid color' and continue.
       - otherwise print the color in lower case."""

    # The following loop will continue as long as the expression is true.
    while True:
        color = input("Enter color\n").lower()
        if color == "quit":
            print("bye")
            break
        if color not in VALID_COLORS:
            print("Not a valid color")
            continue

        print(color)

print_colors()

f-strings and str.format()

f-strings, introduced in Python 3.6, are a way to format strings. They build on str.format() and simplify the syntax.

name = "Fred"
print(f"He said his name is {name}.")
# The print() is not necessary in the interactive interpreter.
# The variable can be referenced directly within the curly brackets, without the need to call the format method.

# The earlier str.format() option looked like this
name2 = "Astaire"
print("His name is {}, and his friend's name is {}".format(name, name2))
He said his name is Fred.
His name is Fred, and his friend's name is Astaire
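
f-strings accept the same format specifications as str.format(). A small sketch follows, where the price value is made up for illustration.

```python
name = "Fred"
price = 1234.5678  # hypothetical value, for illustration only

# !r applies repr(); ,.2f adds a thousands separator and two decimals.
summary = f"{name!r} owes {price:,.2f}"
print(summary)  # 'Fred' owes 1,234.57

# >10 right-aligns the value in a field that is 10 characters wide.
padded = f"{name:>10}|"
print(padded)   #       Fred|

# The equivalent str.format() call uses the same specifiers.
same = "{!r} owes {:,.2f}".format(name, price)
print(same == summary)  # True
```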

General python overview

Syntax, type and operator notes

print(-5 ** 2)  # ** binds tighter than unary minus, so this is -(5 ** 2)
print((-5)**2)
import sys
print(sys.executable)
-25
25
/home/shrysr/miniconda3/envs/min_ds/bin/python
a = 6
print(id(a))
94222151525568
94222151525664

> … a Unicode string is a sequence of code points, which are numbers from 0 through 0x10FFFF (1,114,111 decimal). This sequence of code points needs to be represented in memory as a set of code units, and code units are then mapped to 8-bit bytes. The rules for translating a Unicode string into a sequence of bytes are called a character encoding, or just an encoding.

Reasons for not using Python

Parentheses to chain methods

Enclosing the entire expression within parentheses allows it to span several lines. Python normally treats a line break as the end of a statement, but line breaks inside parentheses are ignored.

# using parentheses to put methods on different lines
(s.rstrip('?!')
  .lower()
  .replace('t', 'a')
  .count('e'))
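
A runnable version of the snippet above, assuming a sample string for s (the original does not define it).

```python
s = "Is the letter t in this sentence?!"  # sample input, assumed

# The parentheses let the chained calls continue across lines.
count = (s.rstrip('?!')
          .lower()
          .replace('t', 'a')
          .count('e'))
print(count)  # 6

# The same chain on one line gives the same result.
one_line = s.rstrip('?!').lower().replace('t', 'a').count('e')
print(count == one_line)  # True
```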

Pandas

Frequency of occurrence: .value_counts()

groupby : same as group_by in R

agg() : similar to summarize in R

ins.groupby('sex').agg({'charges': ['mean', 'max', 'count']}).round(0)

pivot_table() : comparison across groups made easier

pt = ins.pivot_table(index='sex', columns='region',
                     values='charges', aggfunc='mean').round(0)

Test example

ins = read_csv("~/my_projects/

DataFrame and Series

DataFrame: 2D, rectangular data structure.

Series: A single dimension of data. Similar to a single column of data, or a 1D array.

Concise notes of the pep 8 style guide

Naming conventions

Package and Module Names

Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.

When an extension module written in C or C++ has an accompanying Python module that provides a higher level (e.g. more object oriented) interface, the C/C++ module has a leading underscore (e.g. _socket).

Class Names

Class names should normally use the CapWords convention.

The naming convention for functions may be used instead in cases where the interface is documented and used primarily as a callable.

Note that there is a separate convention for builtin names: most builtin names are single words (or two words run together), with the CapWords convention used only for exception names and builtin constants.

Function and Variable Names

Function names should be lowercase, with words separated by underscores as necessary to improve readability.

Variable names follow the same convention as function names.

Method Names and Instance Variables

Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.

Use one leading underscore only for non-public methods and instance variables.

Constants

Constants are usually defined on a module level and written in all capital letters with underscores separating words. Examples include MAX_OVERFLOW and TOTAL.
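
The conventions above can be seen together in a small sketch. The class, method, and variable names here are invented for illustration and do not come from PEP 8 itself, apart from MAX_OVERFLOW.

```python
MAX_OVERFLOW = 10  # module-level constant: capitals with underscores


class ShoppingCart:  # class name: CapWords
    def __init__(self):
        self._items = []  # non-public instance variable: leading underscore

    def add_item(self, item_name):  # method name: lowercase_with_underscores
        if len(self._items) >= MAX_OVERFLOW:
            raise OverflowError("cart is full")
        self._items.append(item_name)

    def item_count(self):
        return len(self._items)


cart = ShoppingCart()
cart.add_item("apple")
print(cart.item_count())  # 1
```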

Using try and except

A try block runs a set of instructions, and an except clause catches exceptions raised while running them. Python's built-in exceptions have specific names (for example FileNotFoundError), so an except clause can catch one particular exception type and bind it to a variable with the as keyword.

try:
    schedule_file = open('schedule.txt', 'r')

except FileNotFoundError as err:
    print (err)
[Errno 2] No such file or directory: 'schedule.txt'
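
A slightly fuller sketch of the same pattern adds an else branch, which runs only when no exception was raised, and a with block so that the file is closed. The function name read_schedule and the file name are assumptions for illustration.

```python
def read_schedule(path):
    """Return the contents of the file at path, or None if it is missing."""
    try:
        schedule_file = open(path, "r")
    except FileNotFoundError as err:
        # err holds the exception instance, including its message.
        print(f"Could not open file: {err}")
        return None
    else:
        # Runs only if open() succeeded; with closes the file afterwards.
        with schedule_file:
            return schedule_file.read()


# Assumes no file with this name exists in the working directory.
contents = read_schedule("no_such_schedule_file_12345.txt")
print(contents)  # None
```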

Notes on Functions

A list of functions / dictionaries

Source: Dan Bader @ Real Python

Functions are first-class objects, so they can be stored in a list (or as the values of a dictionary) without being called. This creates a collection of functions that can be easily called or updated as required.

def addition(a,b):
    return a+b

def subtraction(a,b):
    return a-b

def multiplication(a,b):
    return a*b

func_list = [addition, subtraction, multiplication]
print(func_list[0](3,2))
print(func_list[1](4,5))
print(func_list[2](6,4))
5
-1
24
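
The same idea works with a dictionary, which lets each function be looked up by name instead of by position. A minimal sketch follows; the key names are chosen only for illustration.

```python
def addition(a, b):
    return a + b


def subtraction(a, b):
    return a - b


def multiplication(a, b):
    return a * b


# Map a name to each function object. Note the missing parentheses:
# the functions are stored, not called.
operations = {
    "add": addition,
    "subtract": subtraction,
    "multiply": multiplication,
}

print(operations["add"](3, 2))       # 5
print(operations["subtract"](4, 5))  # -1

# Entries can be added or replaced later, here with the builtin pow.
operations["power"] = pow
print(operations["power"](2, 5))     # 32
```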

Expand shortened pathnames expanduser

from os.path import expanduser
print(expanduser('~/my_org/journal'))
/Users/shreyas/my_org/journal

Lists

List comprehension

Source: bonus content videos of Real Python

# Example of directly defining a list using a shorthand for loop
print([x * x for x in range(10)])

# Basically the above is the same as:

squares = []
for x in range(10):
    squares.append(x * x)

print(squares)


# modifying the original code

squares2 = [x * x for x in range(10)]
print(squares2)

Format: new_list = [expression for member in iterable if condition]

squares = [i * i for i in range(10)]
squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Format with a conditional and a loop. Start with the for loop and then add the conditional. The first keyword will be the item that is returned.

def filter_positive_even_numbers(numbers):
    """Receives a list of numbers, and returns a filtered list of only the
       numbers that are both positive and even (divisible by 2), try to use a
       list comprehension."""
       
    return [number for number in numbers if (number >0) and (number %2 == 0)] 

A list comprehension produces a list, whereas map() produces a map object, which is a lazy iterator.
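
A small sketch of the difference, with illustrative variable names. The map object computes its items only when iterated, and can be consumed only once.

```python
numbers = [1, 2, 3, 4]

squares_list = [n * n for n in numbers]
squares_map = map(lambda n: n * n, numbers)

print(squares_list)                # [1, 4, 9, 16]
print(type(squares_map).__name__)  # map

first_pass = list(squares_map)
second_pass = list(squares_map)
print(first_pass)   # [1, 4, 9, 16]
print(second_pass)  # [] because the iterator is already exhausted
```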

Using the assert statement

Simple example

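# Perform calculation only if a >= 100 and b <= 200

a = 20
b = 200

def test_func(a,b):
    # Use the boolean operator `and` rather than the bitwise `&`.
    assert (a >= 100) and (b <= 200)
    return (a + b)

# a = 20 fails the assertion, so this call raises AssertionError.
print(test_func(a,b))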
# Building on the same example as in the book

def apply_discount(product, discount, threshold):
    price = int(product['price'] * (1.0 * discount))
    assert threshold <= price <= product['price']
    return price

# Defining the product dictionary. One of the keys in the dictionary have to be price

shoes = {'name': 'adidas', 'price': 14900}

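# Applying a 10% discount on shoes, and defining the threshold to be 1500.
# Note: price is computed as 14900 * 0.1 = 1490, which is below the
# threshold, so the assertion fails.
apply_discount(shoes, 10/100, threshold = 1500)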
AssertionError                            Traceback (most recent call last)
<ipython-input-17-a8990c61cc1b> in <module>()
     11 shoes = {'name': 'adidas', 'price': 14900}
     12
---> 13 apply_discount(shoes, 10/100, threshold = 1500)

<ipython-input-17-a8990c61cc1b> in apply_discount(product, discount, threshold)
      4 def apply_discount(product, discount, threshold):
      5     price = int(product['price'] * (1.0 * discount))
----> 6     assert threshold <= price <= product['price']
      7     return price
      8

AssertionError:
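
The AssertionError above carries no message. An assert statement accepts an optional second expression that becomes the error message, which makes failures easier to diagnose. A minimal sketch, using a hypothetical average function:

```python
def average(values):
    # The text after the comma is attached to the AssertionError.
    assert len(values) > 0, "values must not be empty"
    return sum(values) / len(values)


print(average([2, 4, 6]))  # 4.0

try:
    average([])
except AssertionError as err:
    message = str(err)
    print(f"AssertionError: {message}")
```

Assertions are removed when Python runs with the -O option, so they are suited to catching programming errors rather than validating user input.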

Strings

General notes

Some methods associated with manipulating strings:

cap_me = "hello this is a normal string"
print(cap_me.upper())
print(cap_me.count('i'))
# 3_d = "hello"  # invalid: a variable name cannot start with a digit (SyntaxError)

The quotes surrounding a string are called delimiters; they tell Python where the string starts and ends.

Notes on String formatting

# This is a comment that is not interpreted.
first= 'monty'
second='python'
total= first + " " + second
print (total)
print (first,"", second)
# Notice the extra space in the 2nd line of output: print() separates its
# arguments with a single space, so the empty string adds a second one.
print (first,second,'. This is corrected.')
monty python
monty  python
monty python . This is corrected.
print("test")

test

print("This is Mac's notebook")
#if i had used a single quote to enclose the string, then the string would have terminated at Mac's. Knowing this is useful with respect to string manipulation.
This is Mac's notebook
number = 1
string = str(number)
print (type (string))
print (type (number))
<class 'str'>
<class 'int'>
movie1 = "Clear and present danger"
movie2 = "tom dick and harry"
print ("My favorite movies \n \t", movie1, "\n \t", movie2)
My favorite movies
       Clear and present danger
       tom dick and harry

Notes on String manipulation

string1= 'Ragavan'
print(string1[0])
print(string1[4])
print(string1[3:]) #this is to print a range of the characters
R
v
avan
string1= 'Shreyas Ragavan'
print (type(len(string1)))
#note that the space is included as a count
<class 'int'>
string1= 'Shreyas Ragavan'
s2= string1[2:]
s3=string1[4:6]
s4=string1[2:5]
print (s2)
print (s3)
print (s4)
reyas Ragavan
ya
rey
s1="Ragavan"
s2="Jayanthi"
s1_5= len(s1)//2
s2_5= len(s2)//2
print (s1[s1_5:], s2[s2_5:])
avan nthi

Test program

word = 'Python'
first=word[0]
rest=word[1:]
result=rest + "-" + first +"y"
print (result)
ython-Py

Find your python version

export PATH="/Users/shreyas/Applications/anaconda/bin:$PATH"
export PATH="/Users/shreyas/anaconda_install/anaconda3/bin:$PATH"
source ~/.bash_profile

It is important to verify that the intended python interpreter version is being used.

Using the command line (shell):

python --version

Using the sys module in a python program.

import sys

print(sys.version_info)
print(sys.version)
sys.version_info(major=3, minor=6, micro=5, releaselevel='final', serial=0)
3.6.5 |Anaconda, Inc.| (default, Apr 26 2018, 08:42:37)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]

Finding the versions of scipy, numpy, pandas, scikit-learn, matplotlib

import scipy
import numpy
import matplotlib
import pandas
import sklearn

print('scipy: %s' %scipy.__version__)
print('Numpy: %s' %numpy.__version__)
print('Matplotlib: %s' %matplotlib.__version__)
print('Pandas: %s' %pandas.__version__)
print('scikit-learn: %s' %sklearn.__version__)

scipy: 1.2.0
Numpy: 1.15.4
Matplotlib: 3.0.2
Pandas: 0.24.1
scikit-learn: 0.20.2

Commenting in Python

Multi-line comments can be written by commenting out individual lines, or by using triple quotes. Depending on the location of the multi-line comment, a triple-quoted string could be treated as a docstring. When in doubt, add a # on each subsequent line.

# single line. This is called a block comment
# another single line

"""
This is a multiple line comment
and is enclosed within triple quotes.
I am using org mode in Emacs
"""

print("hello world")  # this is an inline comment

python debugger

Once you have an idea of where things might be breaking down, insert the following line of code into your script: import pdb; pdb.set_trace() and run it. This is the Python debugger and will drop you into interactive mode. The debugger can also be run from the command line with python -m pdb <my_file.py>.
- <https://realpython.com/python-beginner-tips/>

The following resources were utilised to develop the snippets and notes below. Other links are also available inline with the text, and I am working on using org-ref for a proper bibliography system (where possible).

Platform independent shell commands like open

Note here the use of platform.system() to identify the platform. The subprocess.call command is used to launch a subprocess as a separate thread.

# Once the folder is created, I want to download pictures of cats
# for this I need a folder location that is available and then need a URL to download from
def display_cats(folder):
    # open folder
    print("Displaying cats in OS window.")
    if platform.system() == "Darwin":
        subprocess.call(["open", folder])
    elif platform.system() == "Windows":
        subprocess.call(["explorer", folder])
    elif platform.system() == "Linux":
        subprocess.call(["xdg-open", folder])
    else:
        print("We don't support your os: " + platform.system())

Python example functions to download a file via URL

import shutil
import requests

def get_cat(
    folder,
    name,
    url="http://consuming-python-services-api.azurewebsites.net/cats/random",
):
    """This function will download data from a url and save the file. In
this case download cat pictures

    """
    url = url
    data = get_data_from_url(url)
    save_image(folder, name, data)


def get_data_from_url():
# Note the stream argument. This is required to store the actual file
# contents
    response = requests.get(url, stream=True)
    print("Status code is {}".format(response.status_code))
    return response.raw


def save_image(folder, name, data):
    file_name = os.path.join(folder, name, data) + ".jpg"
    with open(file_name, "wb") as f:
        shutil.copyfileobj(data, f)

Python truthiness

Empty lists, arrays, dictionaries, zero ints, zero floats, None, null, nil pointers are all deemed to be False in python

Conditional expression has lower precedence than other operators

Example: here the plus operators would be executed first before the conditional is evaluated.

'a' + 'x' if '123'.isdigit() else 'y' + 'b'

Running modules with python -m

How to Run Your Python Scripts – Real Python

The -m option searches sys.path for the module name and runs its content as __main__. Note: module-name needs to be the name of a module object, not a string.

Function can be defined and then assigned later

def func(x):
    return x + 1

f = func
print(f(2) + func(2))
6

String join is different from append

Note here how the result is papaya. The a is joined between each element of the list. i.e join is different from append.

The list to join is always the sole input to join(), which is called on the string you want to join with.

string = 'ppy!'
fruit = 'a'.join(list(string))
print(fruit)
papaya!

String to list

If a sting is converted to a list without any split parameters assigned, then each letter of the string is an item on the list. This makes it simple to work on the characters of a string as required, and each character can then be appended to a list to reconstruct the sentence.

Note then that the whitespaces between words are also captured and this is why the sentence can be reconstructed meaningfully.

text = "The Zen of Python, by Tim Peters"
print(list(text))

# Using a split command on a string will result in a list with the items
# decided based on the split character used.
print(text.split())
['T', 'h', 'e', ' ', 'Z', 'e', 'n', ' ', 'o', 'f', ' ', 'P', 'y', 't', 'h', 'o', 'n', ',', ' ', 'b', 'y', ' ', 'T', 'i', 'm', ' ', 'P', 'e', 't', 'e', 'r', 's']
['The', 'Zen', 'of', 'Python,', 'by', 'Tim', 'Peters']

Objects of type float should not be compared with the equality operator

Instead specify a tolerance and check the difference is within the tolerance

tolerance = 0.001
print(1.1 + 2.2 == 3.3)
comparison = abs((1.1 + 2.2) - 3.3) < tolerance
print(comparison)
False
True

Daily coding problem book

This section contains the solutions outlined in the daily coding problem book with my comments and notes and tasks for further exploration.

The idea to practice good elements of writing code, starting with doctrings and comments and assessing the order of time and space complexities.

Smallest window

Find the smallest window within an array on the path to sorting an array.

Note: A smart Loop initialization and choosing the right kind of variable is the key.

# method 1 use the sorted method available in python. This method is O(n log n)
def smallest_window(input_array):
    left, right = None, None
    s = sorted(input_array)
    # Now we know what the sorted array looks like. However this means
    # an extra complexity in space.

    for i in range(len(input_array)):
        # for the first iteration, left will be None. Therefore if the
        # item is not equal it will be assigned to left.  Here left or
        # right is not related to ascending or descending. THe logic is
        # simply deciding to move the number to ~some direction based on
        # an equality condition.
        if input_array[i] != s[i] and left is None:
            left = i
        elif input_array[i] != s[i]:
            right = i
    return left, right

# method 2 by traversing the array
def smallest_window_traverse(input_array):
    left, right = None, None
    n = len(input_array)
    max_seen, min_seen = -float("inf"), float("inf")

    for i in range(n):
        max_seen = max(max_seen, input_array[i])
        # here max_seen is the running maximum of the array. It starts
        # with - inf and the array[i]. Therefore array[i] will be
        # greater in general. THis is the new running maximum for
        # comparing the next element.
        if input_array[i] < max_seen:
            right = i

    for i in range(n-1, -1 , -1):
        min_seen = min(min_seen, input_array[i])
        if min_seen > input_array[i]:
            left = i
            # How is this the smallest window? The reason is the same
            # array is being traversed from both directions. The window
            # itself is created only if the numbers have to be moved at
            # all basd on whether the max or min criteria is shown to be
            # true.
    return left, right
a = [ 33,1, 444, 555,12,11, 666]
sorted_a = sorted(a)
print(sorted_a)
smallest_window(a)
b = -float("inf")
print(b)

[1, 11, 12, 33, 444, 555, 666] -inf

Maximum Subarray sum

Given an array calculate the maximum sum of any contiguous subarray

# brute force of considering every subarray combination
def brute_force_subarray_sum(array):
    current_max = 0
    # the range for the outer loop has to be length -1 because it can
    # stop at the previous to last element. The inner loop will add the
    # last element to the previous to last element. If the outer loop
    # went up to the length of the array, then the final inner loop
    # iteration would be range(len(array), len(array)). As such this
    # makes no difference to results.
    for i in range(len(array)):
        for j in range(i, len(array)+1):
            current_max = max(current_max, sum(array[i:j]))
            print(sum(array[i:j]))
    return current_max


# Implement kadane's algorithm
def kadane_subarray_sum(array):
    maxSoFar, maxHere = 0, 0
    for item in array:
        maxHere = max(item, maxHere + item)
        print(f"Max at point is {maxHere}")
        maxSoFar = max(maxSoFar, maxHere)
        print(f"Max so far is {maxSoFar}")
    return maxSoFar

a = [1, 4 ,5, 6, 7]
print(f"Kadane max subarray is {kadane_subarray_sum(a)}")
print(f"Brute force subarray is {brute_force_subarray_sum(a)}")

Max at point is 1 Max so far is 1 Max at point is 5 Max so far is 5 Max at point is 10 Max so far is 10 Max at point is 16 Max so far is 16 Max at point is 23 Max so far is 23 Kadane max subarray is 23 0 1 5 10 16 23 0 4 9 15 22 0 5 11 18 0 6 13 0 7 Brute force subarray is 23

Array product manipulation

def products(nums):
    """Given an input of numbers in an array - generate a new array wih the
    products of all the items in the input, excluding the item at the
    corresponding index. Division is not to be used.

    The methodology : for each item in the given array, another 2 arrays
    of corresponding prefix and suffix products are created. Then all
    that has to be done is multiply corresponding elements of the prefix
    and suffix for each element of the input provided.

    """
    # Generate prefix products.  Here prefix_products[-1] * num refers
    # to the previous entry in the list, which is already a product of
    # the /corresponding/ prefixes.  Also note here the way in which the
    # empty list is created and the conditional takes advantage of this
    # fact. This allows the very first element to stay in the loop.
    prefix_products = []
    for num in nums:
        if prefix_products:
            prefix_products.append(prefix_products[-1] * num)
        else:
            prefix_products.append(num)

    # Generate suffix products. One key here is in using the
    # reversed(). This enables the function between prefix and suffix to
    # stay largely the same. 
    suffix_products = []
    for num in reversed(nums):
        if suffix_products:
            suffix_products.append(suffix_products[-1] * num)
        else:
            suffix_products.append(num)
    # Note how the list is reversed again to bring it back in order.
    suffix_products = list(reversed(suffix_products))

    # Generate the results from product of prefix and suffix
    results = []
    for i in range(len(nums)):
        # at the starting of the list, there is no prefix. Therefore
        # only a product of the suffix is required, and from the next
        # index.
        if i == 0:
            results.append(suffix_products[i + 1])
        # at the end of the list, there is no suffix, and hence only the
        # prefix products are required.
        elif i == len(nums) - 1:
            results.append(prefix_products[i - 1])
        else:
            results.append(prefix_products[i - 1] * suffix_products[i + 1])
    return results
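The same prefix/suffix idea can be written more compactly by keeping a running product instead of two full lists. This is only a sketch; the name products_running is hypothetical. It also covers the single-element case, where products above would raise an IndexError.

```python
def products_running(nums):
    """Product of all elements except the one at each index, without division."""
    results = [1] * len(nums)

    # Forward pass: results[i] holds the product of everything before index i.
    prefix = 1
    for i, num in enumerate(nums):
        results[i] = prefix
        prefix *= num

    # Backward pass: multiply in the product of everything after index i.
    suffix = 1
    for i in range(len(nums) - 1, -1, -1):
        results[i] *= suffix
        suffix *= nums[i]

    return results


print(products_running([1, 2, 3, 4]))  # [24, 12, 8, 6]
print(products_running([3, 0, 2]))     # [0, 6, 0]
```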

Using Division

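def products_elements_division(array):
    # Multiply every element together once.
    product = 1
    for num in array:
        product = product * num

    # Divide the total by each element. Note that this raises a
    # ZeroDivisionError if the array contains a zero, and that / returns
    # floats.
    productOtherElements = []
    for num in array:
        productOtherElements.append(product / num)
    return productOtherElements


a = [1, 2, 3, 4]
print(products_elements_division(a))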

Palindrome function

Note: A string can be rearranged (permuted) into a palindrome if every character appears an even number of times, with at most one character appearing an odd number of times (the middle character).

def is_palindrome(word):
    """Take in a word, convert it to lower case and check whether it is a palindrome."""
    word = word.lower()
    # The word is a palindrome if it reads the same when reversed.
    return word[::-1] == word


word = "Deleveled"
print(is_palindrome(word))
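The permutation condition from the note above can be sketched with collections.Counter. The function name can_form_palindrome is my own, and lower-casing the input is an assumption carried over from is_palindrome.

```python
from collections import Counter


def can_form_palindrome(text):
    """Return True if some permutation of text is a palindrome."""
    counts = Counter(text.lower())
    # Count how many characters occur an odd number of times.
    odd_counts = sum(1 for count in counts.values() if count % 2 == 1)
    # At most one character (the middle one) may have an odd count.
    return odd_counts <= 1


print(can_form_palindrome("Deleveled"))  # True
print(can_form_palindrome("aabbc"))      # True
print(can_form_palindrome("abc"))        # False
```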

While, break and continue

This example is from a free PyBites exercise.

VALID_COLORS = ['blue', 'yellow', 'red']


def print_colors():
    """In the while loop ask the user to enter a color,
       lowercase it and store it in a variable. Next check:
       - if 'quit' was entered for color, print 'bye' and break.
       - if the color is not in VALID_COLORS, print 'Not a valid color' and continue.
       - otherwise print the color in lower case."""

    # The following loop will continue as long as the expression is true.
    while True:
        color = input("Enter color\n").lower()
        if color == "quit":
            print("bye")
            break
        if color not in VALID_COLORS:
            print("Not a valid color")
            continue

        print(color)

print_colors()
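To follow the break / continue flow without typing at a prompt, here is a sketch that reads the entries from a list instead of input(). The function process_colors and its entries parameter are hypothetical and not part of the exercise.

```python
VALID_COLORS = ['blue', 'yellow', 'red']


def process_colors(entries):
    """Return the lines that print_colors would print for the given entries."""
    output = []
    for entry in entries:
        color = entry.lower()
        if color == "quit":
            output.append("bye")
            break  # leave the loop entirely
        if color not in VALID_COLORS:
            output.append("Not a valid color")
            continue  # skip the rest of this iteration
        output.append(color)
    return output


print(process_colors(["Blue", "green", "RED", "quit", "yellow"]))
# ['blue', 'Not a valid color', 'red', 'bye']
```

Note that "yellow" is never processed, because break has already ended the loop.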

f-strings and str.format()

f-strings are a newer way to format strings, introduced in Python 3.6. They build on str.format() and simplify the syntax.

name = "Fred"
print(f"He said his name is {name}.")
# the print() is not necessary on the interpreter.
# The variable can be directly defined within the curly brackets without the need to call the format method

# The earlier str.format() option looked like this
name2 = "Astaire"
print("His name is {}, and his friend's name is {}".format(name, name2))
He said his name is Fred.
His name is Fred, and his friend's name is Astaire
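Both f-strings and str.format() accept the same format specifiers after a colon. A short sketch with made-up values:

```python
price = 1234.5678
name = "Fred"

print(f"{price:.2f}")          # 1234.57 (two decimal places)
print(f"{price:,.1f}")         # 1,234.6 (thousands separator)
print(f"{name!r}")             # 'Fred'  (repr of the value)
print(f"{name:>10}|")          # right-aligned in a field of width 10
print("{:.2f}".format(price))  # 1234.57 (same specifier with str.format)
```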

General python overview

Syntax, type and operator notes

print(-5 ** 2)    # ** binds tighter than the unary minus, so this is -(5 ** 2)
print((-5) ** 2)
import sys
print(sys.executable)

-25
25
/home/shrysr/miniconda3/envs/min_ds/bin/python
a = 6
print(id(a))
94222151525568
94222151525664
> ... a Unicode string is a sequence of code points, which are numbers from 0 through 0x10FFFF (1,114,111 decimal). This sequence of code points needs to be represented in memory as a set of code units, and code units are then mapped to 8-bit bytes. The rules for translating a Unicode string into a sequence of bytes are called a character encoding, or just an encoding.
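A small sketch of the quote above, using an example string of my own choosing: ord() gives the code point of each character and str.encode() applies an encoding to produce bytes.

```python
text = "caf\u00e9"  # "café", written with an escape to keep the source ASCII

# Code points: one integer per character.
code_points = [ord(ch) for ch in text]
print(code_points)              # [99, 97, 102, 233]

# Encoding maps the code points to bytes; UTF-8 needs two bytes for the last character.
encoded = text.encode("utf-8")
print(encoded)                  # b'caf\xc3\xa9'
print(len(text), len(encoded))  # 4 5

# Decoding with the same encoding recovers the original string.
print(encoded.decode("utf-8") == text)  # True
```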

Reasons for not using Python

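Parentheses to chain methods

Enclosing the entire expression within parentheses allows a chain of method calls to be spread over several lines. This is needed because line breaks are otherwise significant in Python.

# using parentheses to put methods on different lines
s = "Is this the test?!"  # example string, assumed here so the snippet runs
(s.rstrip('?!')
  .lower()
  .replace('t', 'a')
  .count('e'))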

Pandas

Frequency of occurrence: .value_counts()

groupby : same as group_by in R

agg() : similar to summarize in R

ins.groupby('sex').agg({'charges': ['mean', 'max', 'count']}).round(0)

pivot_table() : comparison across groups made easier

pt = ins.pivot_table(index='sex', columns='region',
                     values='charges', aggfunc='mean').round(0)

Test example

ins = read_csv("~/my_projects/

DataFrame and Series

DataFrame: a 2D, rectangular data structure.

Series: a single dimension of data, similar to a single column of data or a 1D array.

Concise notes of the pep 8 style guide

Naming conventions

Package and Module Names: Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.

When an extension module written in C or C++ has an accompanying Python module that provides a higher level (e.g. more object oriented) interface, the C/C++ module has a leading underscore (e.g. _socket).

Class Names: Class names should normally use the CapWords convention.

The naming convention for functions may be used instead in cases where the interface is documented and used primarily as a callable.

Note that there is a separate convention for builtin names: most builtin names are single words (or two words run together), with the CapWords convention used only for exception names and builtin constants.

Function and Variable Names: Function names should be lowercase, with words separated by underscores as necessary to improve readability.

Variable names follow the same convention as function names.

Method Names and Instance Variables: Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.

Use one leading underscore only for non-public methods and instance variables.

Constants: Constants are usually defined on a module level and written in all capital letters with underscores separating words. Examples include MAX_OVERFLOW and TOTAL.
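The conventions above, put together in one small sketch. The class, the functions and the meaning given to MAX_OVERFLOW here (a cart capacity) are made up purely for illustration.

```python
# Module-level constant: all capital letters with underscores.
MAX_OVERFLOW = 3


class ShoppingCart:  # class name: CapWords
    def __init__(self):
        self._items = []  # non-public instance variable: one leading underscore

    def add_item(self, item_name):  # method: lowercase with underscores
        if len(self._items) >= MAX_OVERFLOW:
            raise ValueError("cart is full")
        self._items.append(item_name)

    def item_count(self):
        return len(self._items)


def build_cart(item_names):  # function: lowercase with underscores
    cart = ShoppingCart()
    for item_name in item_names:
        cart.add_item(item_name)
    return cart


print(build_cart(["tea", "milk"]).item_count())  # 2
```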

Using try and except

A block of instructions can be attempted inside try, and any exception that is raised can be caught with except. Python's built-in exceptions have specific names (for example FileNotFoundError), so a handler can catch one particular exception type and bind the exception object to a variable using as.

try:
    schedule_file = open('schedule.txt', 'r')
except FileNotFoundError as err:
    print(err)
[Errno 2] No such file or directory: 'schedule.txt'
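A slightly fuller sketch of the same pattern, adding an else branch that runs only when the file was opened successfully. The function read_schedule and the file name used below are hypothetical.

```python
def read_schedule(path):
    """Return the file's text, or None if the file does not exist."""
    try:
        schedule_file = open(path, 'r')
    except FileNotFoundError as err:
        print(f"Could not open file: {err}")
        return None
    else:
        # Runs only when no exception was raised in the try block.
        # The with statement closes the file once reading is done.
        with schedule_file:
            return schedule_file.read()


print(read_schedule('no_such_schedule_file.txt'))  # prints the error, then None
```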

Notes on Functions

A list of functions / dictionaries

Source: Dan Bader @ Real Python

Functions are first-class objects in Python, so they can be stored in a list (or a dictionary) and then called through an index (or a key). This essentially creates a collection of functions that can be easily called or updated as required.

def addition(a,b):
    return a+b

def subtraction(a,b):
    return a-b

def multiplication(a,b):
    return a*b

func_list = [addition, subtraction, multiplication]
print(func_list[0](3,2))
print(func_list[1](4,5))
print(func_list[2](6,4))
5
-1
24
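The same idea works with a dictionary, which lets the functions be looked up by name rather than by position. A minimal sketch; the keys are labels chosen only for this example.

```python
def addition(a, b):
    return a + b


def subtraction(a, b):
    return a - b


def multiplication(a, b):
    return a * b


# Keys are hypothetical labels; the values are the function objects themselves.
func_dict = {
    "add": addition,
    "subtract": subtraction,
    "multiply": multiplication,
}

print(func_dict["add"](3, 2))       # 5
print(func_dict["subtract"](4, 5))  # -1
print(func_dict["multiply"](6, 4))  # 24
```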

Expand shortened pathnames with expanduser

from os.path import expanduser
print(expanduser('~/my_org/journal'))
/Users/shreyas/my_org/journal
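The pathlib module offers the same expansion as a method on Path objects. A small sketch that reuses the path from above; the printed result depends on the home directory of whoever runs it.

```python
from os.path import expanduser
from pathlib import Path

# Path.expanduser() replaces the leading ~ with the user's home directory.
journal = Path('~/my_org/journal').expanduser()
print(journal)

# Both approaches are expected to give the same location.
print(str(journal) == expanduser('~/my_org/journal'))
```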

Lists

List comprehension

Source: bonus content videos of Real Python

# Example of directly defining a list using a shorthand for loop
print([x * x for x in range(10)])

# Basically the above is the same as:

squares = []
for x in range(10):
    squares.append(x * x)

print(squares)


# modifying the original code

squares2 = [x * x for x in range(10)]
print(squares2)
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Format: new_list = [expression for member in iterable if condition]

squares = [i * i for i in range(10)]
squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Format with a conditional and a loop: start with the for loop and then add the conditional. The leading expression is the item that ends up in the list.

def filter_positive_even_numbers(numbers):
    """Receives a list of numbers, and returns a filtered list of only the
       numbers that are both positive and even (divisible by 2), try to use a
       list comprehension."""
       
    return [number for number in numbers if (number > 0) and (number % 2 == 0)]

A list comprehension produces a list, whereas map() produces a map object (a lazy iterator).
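The difference can be seen directly in a short sketch with made-up numbers:

```python
numbers = [1, 2, 3, 4]

squares_list = [n * n for n in numbers]      # built immediately, as a list
squares_map = map(lambda n: n * n, numbers)  # lazy iterator, nothing computed yet

print(type(squares_list).__name__)  # list
print(type(squares_map).__name__)   # map

first_pass = list(squares_map)
second_pass = list(squares_map)
print(first_pass)   # [1, 4, 9, 16]
print(second_pass)  # [] because the iterator is already exhausted
```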

Using the assert statement

Simple example

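# Perform the calculation only if a >= 100 and b <= 200

a = 20
b = 200

def test_func(a, b):
    # Use the boolean operator `and`; `&` is the bitwise operator.
    assert (a >= 100) and (b <= 200)
    return a + b

# a = 20 fails the first condition, so this call raises an AssertionError.
print(test_func(a, b))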
# Building on the same example as in the book

def apply_discount(product, discount, threshold):
    price = int(product['price'] * (1.0 * discount))
    assert threshold <= price <= product['price']
    return price

# Defining the product dictionary. One of the keys in the dictionary have to be price

shoes = {'name': 'adidas', 'price': 14900}

# Applying a 10% discount on shoes, and defining the threshold to be 1500
apply_discount(shoes, 10/100, threshold = 1500)
AssertionError                            Traceback (most recent call last)
<ipython-input-17-a8990c61cc1b> in <module>()
     11 shoes = {'name': 'adidas', 'price': 14900}
     12
---> 13 apply_discount(shoes, 10/100, threshold = 1500)

<ipython-input-17-a8990c61cc1b> in apply_discount(product, discount, threshold)
      4 def apply_discount(product, discount, threshold):
      5     price = int(product['price'] * (1.0 * discount))
----> 6     assert threshold <= price <= product['price']
      7     return price
      8

AssertionError:
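The AssertionError above is raised because product['price'] * (1.0 * discount) evaluates to the discount amount (1490), which is below the threshold of 1500. Below is a sketch that assumes the intended value is the price after the discount, i.e. price * (1.0 - discount). That reading is my assumption and is not confirmed by the notes above.

```python
def apply_discount(product, discount, threshold):
    # Assumed intent: the price that remains after taking the discount off.
    price = int(product['price'] * (1.0 - discount))
    # The discounted price must stay between the threshold and the original price.
    assert threshold <= price <= product['price']
    return price


shoes = {'name': 'adidas', 'price': 14900}

# Applying a 10% discount on shoes, with the threshold at 1500
print(apply_discount(shoes, 10/100, threshold=1500))  # 13410
```

Keep in mind that assert statements are skipped when Python is run with the -O flag, so they are better suited to catching programming errors than to validating input.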

Strings

General notes

Some methods associated with manipulating strings:

cap_me = "hello this is a normal string"
print(cap_me.upper())
print(cap_me.count('i'))
# 3_d = "hello"  # SyntaxError: a variable name cannot start with a digit

The quotes surrounding a string are called delimiters; they tell Python where the string starts and ends.

Notes on String formatting

# This is a comment; it is not interpreted.
first = 'monty'
second = 'python'
total = first + " " + second
print(total)
print(first, "", second)
# Notice the extra space in the 2nd line of output: print() separates its
# arguments with a single space, so the empty string adds a second separator.
print(first, second, '. This is corrected.')
monty python
monty  python
monty python . This is corrected.
print("test")

test

print("This is Mac's notebook")
# If a single quote had been used to enclose the string, the string would have terminated at Mac's. Knowing this is useful with respect to string manipulation.
This is Mac's notebook
number = 1
string = str(number)
print (type (string))
print (type (number))
<class 'str'>
<class 'int'>
movie1 = "Clear and present danger"
movie2 = "tom dick and harry"
print ("My favorite movies \n \t", movie1, "\n \t", movie2)
My favorite movies
       Clear and present danger
       tom dick and harry

Notes on String manipulation

string1= 'Ragavan'
print(string1[0])
print(string1[4])
print(string1[3:]) #this is to print a range of the characters
R
v
avan
string1 = 'Shreyas Ragavan'
print(type(len(string1)))
# note that the space is included in the length
<class 'int'>
string1 = 'Shreyas Ragavan'
s2 = string1[2:]
s3 = string1[4:6]
s4 = string1[2:5]
print(s2)
print(s3)
print(s4)
reyas Ragavan
ya
rey
s1 = "Ragavan"
s2 = "Jayanthi"
s1_5 = len(s1) // 2
s2_5 = len(s2) // 2
print(s1[s1_5:], s2[s2_5:])
avan nthi

Test program

word = 'Python'
first=word[0]
rest=word[1:]
result=rest + "-" + first +"y"
print (result)
ython-Py

Find your python version

    export PATH="/Users/shreyas/Applications/anaconda/bin:$PATH"
    export PATH="/Users/shreyas/anaconda_install/anaconda3/bin:$PATH"
source ~/.bash_profile

It is important to verify that the intended python interpreter version is being used.

Using the command line (shell):

python --version

Using the sys module in a python program.

import sys

print(sys.version_info)
print(sys.version)
sys.version_info(major=3, minor=6, micro=5, releaselevel='final', serial=0)
3.6.5 |Anaconda, Inc.| (default, Apr 26 2018, 08:42:37)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
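sys.version_info compares like a tuple, which makes it easy to check for a minimum version inside a script. A minimal sketch; the REQUIRED value and the function name are hypothetical.

```python
import sys

REQUIRED = (3, 6)  # hypothetical minimum version for this sketch


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Return True if the given version is at least `required`."""
    # Compare only the (major, minor) part of the version.
    return tuple(version_info[:2]) >= required


if not meets_requirement():
    raise SystemExit(
        f"Python {REQUIRED[0]}.{REQUIRED[1]}+ is required, found {sys.version.split()[0]}"
    )

print(f"Running Python {sys.version_info.major}.{sys.version_info.minor}")
```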

Finding the versions of scipy, numpy, pandas, scikit-learn, matplotlib

import scipy
import numpy
import matplotlib
import pandas
import sklearn

print('scipy: %s' %scipy.__version__)
print('Numpy: %s' %numpy.__version__)
print('Matplotlib: %s' %matplotlib.__version__)
print('Pandas: %s' %pandas.__version__)
print('scikit-learn: %s' %sklearn.__version__)

scipy: 1.2.0
Numpy: 1.15.4
Matplotlib: 3.0.2
Pandas: 0.24.1
scikit-learn: 0.20.2
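On Python 3.8 and later, the standard library's importlib.metadata can report installed versions without importing each package. A sketch, assuming the distribution names below match what is installed; the helper installed_version is my own.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for package in ["scipy", "numpy", "matplotlib", "pandas", "scikit-learn"]:
    print(f"{package}: {installed_version(package) or 'not installed'}")
```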

Commenting in Python

Multi-line comments can be written by commenting out individual lines, or by using triple quotes. Depending on where the triple-quoted text is located, it could turn into a docstring. When in doubt, add a # on each subsequent line.

# single line. This is called a block comment
# another single line

"""
This is a multiple line comment
and is enclosed within triple quotes.
I am using org mode in Emacs
"""

print("hello world")  # this is an inline comment
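Whether a triple-quoted string becomes a docstring depends on its position: only a string that is the first statement of a function, class or module is stored as __doc__. A short sketch with made-up functions:

```python
def greet():
    """This string is the first statement, so it becomes the docstring."""
    # This comment is discarded by the interpreter.
    return "hello world"


def greet_quietly():
    message = "hello world"
    """Not a docstring: it is not the first statement, so it is just an unused string."""
    return message


print(greet.__doc__)          # This string is the first statement, ...
print(greet_quietly.__doc__)  # None
```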

python debugger

Once you have an idea of where things might be breaking down, insert the following line of code into your script: import pdb; pdb.set_trace() and run it. This is the Python debugger and will drop you into interactive mode. The debugger can also be run from the command line with python -m pdb <my_file.py>.
- <https://realpython.com/python-beginner-tips/>
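A minimal sketch of where such a line might go. The function average and its debug flag are hypothetical; the flag is only there so that the script runs straight through unless debugging is requested.

```python
import pdb


def average(values, debug=False):
    total = sum(values)
    if debug:
        # Execution pauses here and drops into the interactive debugger,
        # where variables such as total and values can be inspected.
        pdb.set_trace()
    return total / len(values)


print(average([2, 4, 6]))  # 4.0
```

Calling average([2, 4, 6], debug=True) would stop at the pdb prompt; typing c (continue) resumes execution.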