Tags: technicalNotes, PythonNotes, DataScienceJournal

The following (incomplete) list of resources was used to develop the snippets and notes below. Other links are available inline with the text, and I am working on using org-ref for a proper bibliography system (where possible).

The Mouse v/s The Python - Mike Driscoll's website
Real Python email newsletters, books, courses.
Howard Abrams' video on literate DevOps using Emacs, as well as his blog posts in general
Python cookbook : recipes for mastering Python 3.
Ted Petrou's courses on data science
pybites
Business Science Slack channel
Data36 blog posts and Slack channel

Platform independent shell commands like open

Note here the use of platform.system() to identify the platform. subprocess.call launches the command as a separate child process (not a thread) and waits for it to finish.

import platform
import subprocess

# Once the folder is created, I want to download pictures of cats.
# For this I need an available folder location and a URL to download from.
def display_cats(folder):
    # Open the folder in the platform's file browser.
    print("Displaying cats in OS window.")
    if platform.system() == "Darwin":
        subprocess.call(["open", folder])
    elif platform.system() == "Windows":
        subprocess.call(["explorer", folder])
    elif platform.system() == "Linux":
        subprocess.call(["xdg-open", folder])
    else:
        print("We don't support your os: " + platform.system())

Python example functions to download a file via URL

'stream' argument with requests.get
shutil library to copy a file object into a binary file.

import os
import shutil

import requests


def get_cat(
    folder,
    name,
    url="http://consuming-python-services-api.azurewebsites.net/cats/random",
):
    """Download data from a url and save it to a file. In this case,
    download cat pictures.
    """
    data = get_data_from_url(url)
    save_image(folder, name, data)


def get_data_from_url(url):
    # Note the stream argument. It defers downloading the body so that
    # response.raw can be read as a file-like object.
    response = requests.get(url, stream=True)
    print("Status code is {}".format(response.status_code))
    return response.raw


def save_image(folder, name, data):
    # data is the file-like object to copy; it is not part of the path.
    file_name = os.path.join(folder, name + ".jpg")
    with open(file_name, "wb") as f:
        shutil.copyfileobj(data, f)

Python truthiness

Empty strings, lists, tuples, dictionaries and sets, zero ints, zero floats, False and None are all treated as False in Python. (Python has no null or nil; None plays that role.)
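
The truthiness rule can be checked directly with bool(). The following is a minimal sketch; the names falsy_values and describe are made up for illustration.

```python
# Each of these values is falsy: bool(value) is False.
falsy_values = [[], (), {}, set(), "", 0, 0.0, None, False]
falsy_results = [bool(value) for value in falsy_values]
print(falsy_results)

# Non-empty containers are truthy, even if their contents are falsy.
print(bool([0]), bool(" "), bool({"key": None}))


def describe(items):
    """Idiomatic emptiness check: rely on truthiness rather than len()."""
    return "has items" if items else "empty"


print(describe([]))      # empty
print(describe([1, 2]))  # has items
```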

Conditional expression has lower precedence than other operators

Example: here both plus operators bind more tightly than the conditional, so the expression is grouped as ('a' + 'x') if '123'.isdigit() else ('y' + 'b').

'a' + 'x' if '123'.isdigit() else 'y' + 'b'
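
To make the grouping visible, the sketch below evaluates the same expression with and without explicit parentheses. The variable names are illustrative only.

```python
# Without parentheses the conditional is applied last:
# ('a' + 'x') if '123'.isdigit() else ('y' + 'b')
unparenthesised = 'a' + 'x' if '123'.isdigit() else 'y' + 'b'
print(unparenthesised)  # ax

# Parentheses make the conditional choose only the middle character.
parenthesised = 'a' + ('x' if '123'.isdigit() else 'y') + 'b'
print(parenthesised)  # axb
```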

Running modules with python `-m`

How to Run Your Python Scripts – Real Python

The -m option searches sys.path for the module name and runs its content as __main__. Note: module-name must be the importable name of the module (for example hello), not a file path such as hello.py.

A function can be defined and then assigned to another name later

def func(x):
    return x + 1

f = func
print(f(2) + func(2))

String join is different from append

Note here how the result is papaya!. The 'a' is inserted between each pair of adjacent elements of the list, i.e. join is different from append.

The list to join is always the sole input to join(), which is called on the string you want to join with.

string = 'ppy!'
fruit = 'a'.join(list(string))
print(fruit)

papaya!
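
A slightly more typical use of join(), next to append() for contrast. This is a small sketch with made-up sample lists.

```python
words = ["mango", "papaya", "guava"]
joined = ", ".join(words)
print(joined)  # mango, papaya, guava

# join() only accepts strings, so other types must be converted first.
numbers = [1, 2, 3]
joined_numbers = "-".join(str(n) for n in numbers)
print(joined_numbers)  # 1-2-3

# append() is a list method: it adds one element to the end of a list.
letters = list("ppy")
letters.append("!")
print(letters)  # ['p', 'p', 'y', '!']
```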

String to list

If a string is converted to a list with list(), without any split, then each character of the string becomes an item in the list. This makes it simple to work on the characters of a string as required, and the characters can later be joined again to reconstruct the sentence.

Note then that the whitespaces between words are also captured and this is why the sentence can be reconstructed meaningfully.

text = "The Zen of Python, by Tim Peters"
print(list(text))

# Using a split command on a string will result in a list with the items
# decided based on the split character used.
print(text.split())

['T', 'h', 'e', ' ', 'Z', 'e', 'n', ' ', 'o', 'f', ' ', 'P', 'y', 't', 'h', 'o', 'n', ',', ' ', 'b', 'y', ' ', 'T', 'i', 'm', ' ', 'P', 'e', 't', 'e', 'r', 's']
['The', 'Zen', 'of', 'Python,', 'by', 'Tim', 'Peters']
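
Going the other way, both lists can be joined back into the original sentence. A minimal sketch; note that rebuilding from split() only round-trips here because the words are separated by single spaces.

```python
text = "The Zen of Python, by Tim Peters"

# list() keeps every character, including spaces, so an empty
# separator restores the sentence exactly.
characters = list(text)
rebuilt_from_chars = "".join(characters)

# split() drops the whitespace, so it has to be put back by join().
words = text.split()
rebuilt_from_words = " ".join(words)

print(rebuilt_from_chars == text)
print(rebuilt_from_words == text)
```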

Objects of type float should not be compared with the equality operator

Instead specify a tolerance and check the difference is within the tolerance

tolerance = 0.001
print(1.1 + 2.2 == 3.3)
comparison = abs((1.1 + 2.2) - 3.3) < tolerance
print(comparison)

False
True
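
The standard library also offers math.isclose() for this kind of comparison. The sketch below shows it next to the manual tolerance check; by default it uses a relative tolerance, so comparisons against zero need an explicit abs_tol.

```python
import math

total = 1.1 + 2.2
print(total == 3.3)              # False
print(math.isclose(total, 3.3))  # True

# The default tolerance is relative (rel_tol=1e-09), which never
# matches a comparison against exactly zero.
near_zero_default = math.isclose(1e-10, 0.0)
near_zero_abs = math.isclose(1e-10, 0.0, abs_tol=1e-9)
print(near_zero_default)  # False
print(near_zero_abs)      # True
```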

Daily coding problem book

This section contains the solutions outlined in the daily coding problem book with my comments and notes and tasks for further exploration.

The idea is to practice the elements of writing good code, starting with docstrings and comments, and assessing the order of time and space complexity.

Smallest window

Find the smallest window within an array on the path to sorting an array.

Note: smart loop initialization and choosing the right kind of variable are the key.

# Method 1: use Python's built-in sorted(). This method is O(n log n).
def smallest_window(input_array):
    left, right = None, None
    s = sorted(input_array)
    # Now we know what the sorted array looks like. However, this costs
    # an extra O(n) in space.

    for i in range(len(input_array)):
        # On the first mismatch left is still None, so the index is
        # assigned to left. Here left and right are not related to
        # ascending or descending order: left is the first index that
        # differs from the sorted array, and right ends up as the last
        # index that differs.
        if input_array[i] != s[i] and left is None:
            left = i
        elif input_array[i] != s[i]:
            right = i
    return left, right

# method 2 by traversing the array
def smallest_window_traverse(input_array):
    left, right = None, None
    n = len(input_array)
    max_seen, min_seen = -float("inf"), float("inf")

    for i in range(n):
        max_seen = max(max_seen, input_array[i])
        # max_seen is the running maximum of the array. It starts at
        # -inf, so the first element always becomes the running maximum.
        # Any later element smaller than the running maximum is out of
        # place, so the window must extend at least this far right.
        if input_array[i] < max_seen:
            right = i

    for i in range(n - 1, -1, -1):
        min_seen = min(min_seen, input_array[i])
        # An element larger than the running minimum seen from the
        # right is out of place, so the window must extend this far left.
        if input_array[i] > min_seen:
            left = i
            # How is this the smallest window? The same array is
            # traversed from both directions, and the window only grows
            # when an element violates the running max or min, i.e. when
            # it would actually have to move.
    return left, right

a = [ 33,1, 444, 555,12,11, 666]
sorted_a = sorted(a)
print(sorted_a)
smallest_window(a)
b = -float("inf")
print(b)

[1, 11, 12, 33, 444, 555, 666]
-inf

Maximum Subarray sum

Given an array calculate the maximum sum of any contiguous subarray

[X] Fix the indices for the brute force algorithm
- The inner loop has to travel one time more than the length of the array.
[ ] Why O(n^3) for brute force?
- n iterations of the outer loop over the start index
- up to n iterations of the inner loop over the end index
- sum() traverses the slice again for every (start, end) pair, which costs up to n more steps

# brute force of considering every subarray combination
def brute_force_subarray_sum(array):
    current_max = 0
    # Slices exclude their end index, so array[i:j] stops at index
    # j - 1. The inner loop therefore has to run up to len(array) + 1
    # for the last element to be included. When j == i the slice is
    # empty and its sum is 0, which makes no difference to the result.
    for i in range(len(array)):
        for j in range(i, len(array)+1):
            current_max = max(current_max, sum(array[i:j]))
            print(sum(array[i:j]))
    return current_max


# Implement kadane's algorithm
def kadane_subarray_sum(array):
    maxSoFar, maxHere = 0, 0
    for item in array:
        maxHere = max(item, maxHere + item)
        print(f"Max at point is {maxHere}")
        maxSoFar = max(maxSoFar, maxHere)
        print(f"Max so far is {maxSoFar}")
    return maxSoFar

a = [1, 4 ,5, 6, 7]
print(f"Kadane max subarray is {kadane_subarray_sum(a)}")
print(f"Brute force subarray is {brute_force_subarray_sum(a)}")

Max at point is 1
Max so far is 1
Max at point is 5
Max so far is 5
Max at point is 10
Max so far is 10
Max at point is 16
Max so far is 16
Max at point is 23
Max so far is 23
Kadane max subarray is 23
0
1
5
10
16
23
0
4
9
15
22
0
5
11
18
0
6
13
0
7
Brute force subarray is 23
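
The third factor of n in the brute-force version comes from calling sum() on every slice. As a sketch of a middle ground (not from the book; the function name is my own), the sum can be carried along while the subarray grows, which brings the cost down to O(n^2). Kadane's algorithm above remains the O(n) solution.

```python
def running_sum_subarray_sum(array):
    """O(n^2) variant: extend each subarray by one element and reuse the sum."""
    current_max = 0  # the empty subarray counts, as in the functions above
    for i in range(len(array)):
        running = 0
        for j in range(i, len(array)):
            # running now holds sum(array[i:j + 1]) without re-traversing.
            running += array[j]
            current_max = max(current_max, running)
    return current_max


print(running_sum_subarray_sum([1, 4, 5, 6, 7]))            # 23
print(running_sum_subarray_sum([34, -50, 42, 14, -5, 86]))  # 137
```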

Array product manipulation

[X] devise a solution with division
[ ] Improve the solution made using division using list comprehension

def products(nums):
    """Given an input of numbers in an array - generate a new array with the
    products of all the items in the input, excluding the item at the
    corresponding index. Division is not to be used.

    The methodology : for each item in the given array, another 2 arrays
    of corresponding prefix and suffix products are created. Then all
    that has to be done is multiply corresponding elements of the prefix
    and suffix for each element of the input provided.

    """
    # Generate prefix products.  Here prefix_products[-1] * num refers
    # to the previous entry in the list, which is already a product of
    # the /corresponding/ prefixes.  Also note here the way in which the
    # empty list is created and the conditional takes advantage of this
    # fact. This allows the very first element to stay in the loop.
    prefix_products = []
    for num in nums:
        if prefix_products:
            prefix_products.append(prefix_products[-1] * num)
        else:
            prefix_products.append(num)

    # Generate suffix products. One key here is in using the
    # reversed(). This enables the function between prefix and suffix to
    # stay largely the same. 
    suffix_products = []
    for num in reversed(nums):
        if suffix_products:
            suffix_products.append(suffix_products[-1] * num)
        else:
            suffix_products.append(num)
    # Note how the list is reversed again to bring it back in order
    suffix_products = list(reversed(suffix_products))

    # Generate the results from product of prefix and suffix
    results = []
    for i in range(len(nums)):
        # at the starting of the list, there is no prefix. Therefore
        # only a product of the suffix is required, and from the next
        # index.
        if i == 0:
            results.append(suffix_products[i + 1])
        # at the end of the list, there is no suffix, and hence only the
        # prefix products are required.
        elif i == len(nums) - 1:
            results.append(prefix_products[i - 1])
        else:
            results.append(prefix_products[i - 1] * suffix_products[i + 1])
    return results

Using Division

def products_elements_division(array):
    product = 1
    for i in range(len(array)):
        product = product * array[i]

    productOtherElements = []
    for i in range(len(array)):
        productOtherElements.append(product/array[i])
    return productOtherElements

a = [1,2,3,4]
products_elements_division(a)
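
One possible take on the open task above, using a list comprehension. This is a sketch under two assumptions: the input holds non-zero integers, and Python 3.8+ is available for math.prod. It uses floor division, so it returns ints where the loop version above returns floats.

```python
import math


def products_division_comprehension(array):
    """Product of all other elements, using division and a list comprehension.

    Assumes non-zero integers: a zero in the input raises
    ZeroDivisionError, just as it does in the loop-based version.
    """
    total = math.prod(array)  # math.prod requires Python 3.8+
    # total is an exact multiple of every item, so // loses nothing.
    return [total // item for item in array]


print(products_division_comprehension([1, 2, 3, 4]))  # [24, 12, 8, 6]
```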

Palindrome function

Note: Condition check for a palindrome using any permutation of the strings: Each character appears an even number of times allowing only one character to appear an odd number of times (middle character).

def is_palindrome(word):
    """Take in a word, convert it to lower case and return True if the
    word is a palindrome, False otherwise."""
    word = word.lower()
    # word[::-1] is the reversed string.
    return word[::-1] == word


word = "Deleveled"
print(is_palindrome(word))
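
The permutation condition noted at the start of this section (at most one character with an odd count) can be sketched with collections.Counter. The function name and the sample words are my own.

```python
from collections import Counter


def can_form_palindrome(word):
    """Return True if some permutation of word is a palindrome."""
    counts = Counter(word.lower())
    # Only the middle character of a palindrome may appear an odd
    # number of times.
    odd_counts = sum(1 for count in counts.values() if count % 2 == 1)
    return odd_counts <= 1


print(can_form_palindrome("carrace"))  # True ("racecar")
print(can_form_palindrome("daily"))    # False
```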

While, break and continue

This example is from a free pybites exercise.

VALID_COLORS = ['blue', 'yellow', 'red']


def print_colors():
    """In the while loop ask the user to enter a color,
       lowercase it and store it in a variable. Next check:
       - if 'quit' was entered for color, print 'bye' and break.
       - if the color is not in VALID_COLORS, print 'Not a valid color' and continue.
       - otherwise print the color in lower case."""

    # The condition is always True, so the loop runs until a break is
    # reached.
    while True:
        color = input("Enter color\n").lower()
        if color == "quit":
            print("bye")
            break
        if color not in VALID_COLORS:
            print("Not a valid color")
            # continue skips the rest of the body and asks again.
            continue

        print(color)

print_colors()

`f-strings` and `str.format()`

f-strings, introduced in Python 3.6, are a way to format strings. They build on str.format() and simplify the syntax.

name = "Fred"
print(f"He said his name is {name}.")
# the print() is not necessary on the interpreter.
# The variable can be directly defined within the curly brackets without the need to call the format method

# The earlier str.format() option looked like this
name2 = "Astaire"
print("His name is {}, and his friend's name is {}".format(name, name2))

He said his name is Fred.
His name is Fred, and his friend's name is Astaire
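
f-strings accept the same format specifications as str.format(), and can hold arbitrary expressions. A short sketch with made-up values:

```python
name = "Fred"
price = 1234.5678

two_decimals = f"{price:.2f}"   # fixed number of decimals
thousands = f"{price:,.1f}"     # thousands separator
padded = f"{name:>10}|"         # right-align in a field of width 10
expression = f"{2 + 3}"         # any expression can go in the braces

print(two_decimals)  # 1234.57
print(thousands)     # 1,234.6
print(padded)
print(expression)    # 5

# The same format spec works with str.format().
print("{:.2f}".format(price) == two_decimals)  # True
```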

General python overview

high level language (i.e. compact syntax, and easy to learn)
Technically, Python code is compiled to bytecode, but not to machine code.
Memory management and other low-level aspects are handled internally. Low level languages like C leave these to the programmer.
The original (reference) implementation of Python is CPython (written in C), first released in 1991.
Dynamically typed language (a name can be rebound to an object of a different type after it is first assigned)
considered an interpreted language, as CPython compiles the code to bytecode at runtime and then interprets it.

Syntax, `type` and operator notes

// : floor division
** : exponentiation
% : modulus operator
unary and binary operators
- Unary : -5 or +10
- binary : 5 -10
- the exponentiation operator has higher precedence than the unary minus, so -5 ** 2 is read as -(5 ** 2).

        print(-5 ** 2)
        print((-5)**2)
        import sys
        print(sys.executable)
    
        -25
        25
        /home/shrysr/miniconda3/envs/min_ds/bin/python

The `not` operator takes precedence over `and` which takes precedence over `or`.
Augmented assignment
- `+=`, `-=`, `*=`, `/=`, `//=`, `%=`, `**=`
- the variable to which this is being assigned must already be created.
id : every object created is stored in a specific location in memory. In CPython, id() returns that address. However, it is important to note that 'a' in the example below is simply a name that refers to the actual object. 'a' by itself is not an object.

a = 6
print(id(a))

94222151525568
94222151525664
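
The difference between a name and the object it refers to shows up most clearly with a mutable object. A minimal sketch, with illustrative variable names:

```python
a = [1, 2, 3]
b = a          # b is a second name for the same list object
c = list(a)    # c is a new list object with equal contents

print(id(a) == id(b))  # True
print(a is b)          # True: 'is' compares identity
print(a == c, a is c)  # True False: equal contents, different objects

# Changing the object through one name is visible through the other.
b.append(4)
print(a)  # [1, 2, 3, 4]
print(c)  # [1, 2, 3]
```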

Common Built-in object types are:
- int,
- bool,
- float,
- complex (by appending 'j').
- list
- dict
- tuple
- set
strings:
- sequence of characters
- Character: smallest possible component of text that can be printed with single keyboard press.
- a single character is a string of length 1
Encoding - UTF8/ASCII
- ASCII: represents 128 unique characters using 7 bits. 7 bits can encode 2^7 = 128 characters.
- bit : smallest unit of information for a computer.
- Unicode: assigns each character a code point, a number from 0 through 0x10FFFF. The encoding decides how many bytes a code point takes: UTF-32 uses a fixed 4 bytes (32 bits, enough for 2^32 or about 4 billion values), whereas UTF-8 uses 1 to 4 bytes per character.
- internally, each character is represented as an integer in python.
- More details can be found in the Unicode HOWTO of the Python documentation.

> … a Unicode string is a sequence of code points, which are numbers from 0 through 0x10FFFF (1,114,111 decimal). This sequence of code points needs to be represented in memory as a set of code units, and code units are then mapped to 8-bit bytes. The rules for translating a Unicode string into a sequence of bytes are called a character encoding, or just an encoding.
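
The relation between characters, code points and encoded bytes can be explored with ord(), chr() and str.encode(). A small sketch; the sample characters are my own choice.

```python
char = "\u00e9"  # e with acute accent

# ord() gives the code point as an integer, chr() goes the other way.
code_point = ord(char)
print(code_point)               # 233
print(chr(code_point) == char)  # True

# The number of bytes depends on the encoding and on the character.
utf8_bytes = char.encode("utf-8")
ascii_len = len("a".encode("utf-8"))
euro_len = len("\u20ac".encode("utf-8"))   # euro sign
utf32_len = len("a".encode("utf-32-le"))   # fixed width, no byte order mark

print(len(utf8_bytes), ascii_len, euro_len, utf32_len)  # 2 1 3 4

# Decoding with the same encoding restores the string.
print(utf8_bytes.decode("utf-8") == char)  # True
```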

Reasons for not using Python

No single programming language is 'always' the right choice.
Example: "..unlikely you're going to write a real-time operating system kernel in Python."
Example: unlikely that Python will be used to implement the next generation rendering engine.

Parentheses to chain methods

Enclosing the entire expression in parentheses lets a method chain span several lines. Without them, Python treats each newline as the end of the statement.

# using parentheses to put methods on different lines
s = "Is this the test?!"  # example string, added so the snippet runs on its own
(s.rstrip('?!')
  .lower()
  .replace('t', 'a')
  .count('e'))

Pandas

Frequency of occurrence `.value_counts()`

`groupby` : same as `group_by` in R

`agg()` : similar to `summarize` in R

ins.groupby('sex').agg({'charges': ['mean', 'max', 'count']}).round(0)

`pivot_table()` : comparison across groups made easier

pt = ins.pivot_table(index='sex', columns='region',
                     values='charges', aggfunc='mean').round(0)

Test example

ins = read_csv("~/my_projects/

DataFrame and Series

DataFrame: 2D, rectangular data structure.
Series: a single dimension of data. Similar to a single column of data, or a 1D array.

Concise notes of the pep 8 style guide

PEP 8: Python Enhancement Proposal 8.
The guide is not gospel. There may be situations where not following the style is more important. Therefore one should know when to be inconsistent.
Consistency within a module or function is most important. This is followed by consistency within the project, and then consistency with the style guide.
The above is especially applicable when dealing with older code.
4 spaces per indentation level
[ ] Find out what are hanging indents
[ ] Note the methods of closing brackets for multi-line components.
Limit all lines to a max of 79 characters
Code in the core Python distribution should always use UTF-8 (or ASCII in Python 2).
Files using ASCII (in Python 2) or UTF-8 (in Python 3) should not have an encoding declaration.
imports should be on separate lines.
However this is okay : from subprocess import Popen, PIPE
Imports are always put on top of a file
Imports should be grouped in the following order:
- Standard library imports.
- Related third party imports.
- Local application/library specific imports.
- You should put a blank line between each group of imports.
Module level "dunders" (i.e. names with two leading and two trailing underscores) such as __all__, __author__, __version__, etc. should be placed after the module docstring but before any import statements except from __future__ imports. Python mandates that future-imports must appear in the module before any other code except docstrings.
use inline comments sparingly. Inline comments are comments that are on the same line as the statement
Functions and classes should be separated by 2 blank lines
continuations of long expressions onto additional lines should be indented by 4 extra spaces from their normal indentation level.

Naming conventions

Package and Module Names

Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.

When an extension module written in C or C++ has an accompanying Python module that provides a higher level (e.g. more object oriented) interface, the C/C++ module has a leading underscore (e.g. _socket).

Class Names

Class names should normally use the CapWords convention.

The naming convention for functions may be used instead in cases where the interface is documented and used primarily as a callable.

Note that there is a separate convention for builtin names: most builtin names are single words (or two words run together), with the CapWords convention used only for exception names and builtin constants.

Function and Variable Names

Function names should be lowercase, with words separated by underscores as necessary to improve readability.

Variable names follow the same convention as function names.

Method Names and Instance Variables

Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.

Use one leading underscore only for non-public methods and instance variables.

Constants

Constants are usually defined on a module level and written in all capital letters with underscores separating words. Examples include MAX_OVERFLOW and TOTAL.

Using try and except

A set of instructions can be placed inside a try block, and any exception it raises can be handled in an except clause. Python's built-in exceptions have specific names (FileNotFoundError, for example). An except clause can match an exception by that name and, using as, store the exception object in a variable.

try:
    schedule_file = open('schedule.txt', 'r')

except FileNotFoundError as err:
    print (err)

[Errno 2] No such file or directory: 'schedule.txt'
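
The same pattern can be wrapped in a function, with an else clause for the code that should run only when no exception occurred. This is a sketch; the function name and the file name are hypothetical, and the file is assumed not to exist.

```python
def read_first_line(path):
    """Return the first line of a file, or None if the file does not exist."""
    try:
        schedule_file = open(path, "r")
    except FileNotFoundError as err:
        # err holds the exception object, including the message.
        print(f"Could not open file: {err}")
        return None
    else:
        # Runs only when the try block raised no exception.
        with schedule_file:
            return schedule_file.readline()


# Hypothetical file name, assumed to be missing from the working directory.
missing = read_first_line("no_such_schedule_file.txt")
print(missing)
```

Keeping only the open() call inside try makes sure that unrelated errors in the rest of the code are not swallowed by the except clause.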

Notes on Functions

A list of functions / dictionaries :dictionary:

Source: Dan Bader @ Real Python

Functions are objects, so they can be stored as items in a list (or as values in a dictionary) without being called. This creates a collection of functions that can easily be called by index or updated as required.

def addition(a,b):
    return a+b

def subtraction(a,b):
    return a-b

def multiplication(a,b):
    return a*b

func_list = [addition, subtraction, multiplication]
print(func_list[0](3,2))
print(func_list[1](4,5))
print(func_list[2](6,4))

5
-1
24
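
The heading also mentions dictionaries. A dictionary of functions works the same way, with a readable key instead of an index. A minimal sketch using the standard operator module; the names operations and calculate are my own.

```python
import operator

# Map a readable name to a function object (none of them is called here).
operations = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
}


def calculate(name, a, b):
    """Look up the function by name and call it."""
    return operations[name](a, b)


print(calculate("add", 3, 2))       # 5
print(calculate("subtract", 4, 5))  # -1
print(calculate("multiply", 6, 4))  # 24
```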

Expand shortened pathnames `expanduser`

from os.path import expanduser
print(expanduser('~/my_org/journal'))

/Users/shreyas/my_org/journal

Lists

List comprehension

Source: bonus content videos of Real Python

defining a list directly using a for loop.
shorthand for a regular for loop
shorthand for a regular loop and adding filtering

# Example of directly defining a list using a shorthand for loop
print([x * x for x in range(10)])

# Basically the above is the same as:

squares = []
for x in range(10):
    squares.append(x * x)

print(squares)


# modifying the original code

squares2 = [x * x for x in range(10)]
print(squares2)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

<https://www.programiz.com/python-programming/list-comprehension>

Format: new_list = [expression for member in iterable if condition]

squares = [i * i for i in range(10)]
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Format with a conditional and a loop. Start with the for loop and then add the conditional. The first expression is the item that is added to the list.

def filter_positive_even_numbers(numbers):
    """Receives a list of numbers, and returns a filtered list of only the
       numbers that are both positive and even (divisible by 2), try to use a
       list comprehension."""
       
    return [number for number in numbers if (number >0) and (number %2 == 0)]

A list comprehension produces a list, whereas map() returns a map object, which is a lazy iterator.
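
The difference between the two can be seen by checking the types and by consuming the map object twice. A short sketch with made-up data:

```python
numbers = [1, 2, 3, 4]

squares_list = [n * n for n in numbers]
squares_map = map(lambda n: n * n, numbers)

print(type(squares_list).__name__)  # list
print(type(squares_map).__name__)   # map

# A map object is an iterator: it yields its items once and is then empty.
first_pass = list(squares_map)
second_pass = list(squares_map)
print(first_pass)   # [1, 4, 9, 16]
print(second_pass)  # []
```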

Using the assert statement :assert:

Source: python tricks book
assert is used to check that a condition holds. The program proceeds if the condition is true. If not, an AssertionError is raised. The error message can be customised.
Assertions are internal self-checks and are not meant as a communication to the user.
They are meant to inform the developer about unrecoverable errors in the program. This differentiates them from the usual if-else conditionals.
Aids in debugging.
Using an additional argument it is possible to provide a custom message.
Don't use asserts for data validation
- assertions can be disabled globally using the -O and -OO switches (and other techniques)
- When disabled, none of the assert conditions are evaluated.
- Therefore, never use an assert to check for admin privileges or such conditions. Remember the example of deleting a product from a product catalog.
Asserts that never fail
- passing a tuple as the first argument, e.g. assert (condition, 'message'), always passes, because a non-empty tuple is truthy.
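
The custom message mentioned above goes after a comma, as a second expression. A minimal sketch (the function average is made up for illustration, and the script is assumed to run without the -O switch):

```python
def average(values):
    """Return the mean of a non-empty list of numbers."""
    # The second expression becomes the message of the AssertionError.
    assert len(values) > 0, "average() needs at least one value"
    return sum(values) / len(values)


message = None
try:
    average([])
except AssertionError as err:
    message = str(err)

print(message)              # average() needs at least one value
print(average([2, 4, 6]))   # 4.0
```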

Simple example

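# Perform calculation only if a >= 100 and b <= 200

a = 20
b = 200

def test_func(a, b):
    # 'and' is the logical operator; '&' is the bitwise one.
    assert (a >= 100) and (b <= 200)
    return a + b

# Raises AssertionError here, because a is less than 100.
print(test_func(a, b))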

# Building on the same example as in the book

def apply_discount(product, discount, threshold):
    price = int(product['price'] * (1.0 * discount))
    assert threshold <= price <= product['price']
    return price

# Defining the product dictionary. One of the keys in the dictionary have to be price

shoes = {'name': 'adidas', 'price': 14900}

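# Applying a 10% discount on shoes, and defining the threshold to be 1500.
# Note: 1.0 * discount yields the discount amount (1490) rather than the
# discounted price; the book uses (1.0 - discount). 1490 is below the
# threshold, so the assertion fails.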
apply_discount(shoes, 10/100, threshold = 1500)

AssertionError                            Traceback (most recent call last)
<ipython-input-17-a8990c61cc1b> in <module>()
     11 shoes = {'name': 'adidas', 'price': 14900}
     12
---> 13 apply_discount(shoes, 10/100, threshold = 1500)

<ipython-input-17-a8990c61cc1b> in apply_discount(product, discount, threshold)
      4 def apply_discount(product, discount, threshold):
      5     price = int(product['price'] * (1.0 * discount))
----> 6     assert threshold <= price <= product['price']
      7     return price
      8

AssertionError:

Strings :strings:

General notes

Some methods associated with manipulating strings:

upper()
count()

cap_me = "hello this is a normal string"
print(cap_me.upper())
print(cap_me.count('i'))
# 3_d = "hello"  # invalid: a variable name cannot start with a digit (SyntaxError)

The quotes surrounding a string are called delimiters. They tell Python where the string starts and ends.

Notes on String formatting

Strings can be enclosed within single or double quotes. It is better to be consistent throughout the program.
Strings can be concatenated with +, and spaces can be included. A hash (#) starts a comment. print() accepts several arguments and prints them separated by a space.

#This is a comment that is not interpreted.
first= 'monty'
second='python'
total= first + " " + second
print (total)
print (first,"", second)
#notice the extra space in the output of the 2nd print: print() already separates its arguments with a space, so the empty string adds a second one.
print (first,second,'. This is corrected.')

monty python
monty  python
monty python . This is corrected.

Double quote and single quote combination can be used for words that have an apostrophe

print("test")

test

print("This is Mac's notebook")
#if i had used a single quote to enclose the string, then the string would have terminated at Mac's. Knowing this is useful with respect to string manipulation.

This is Mac's notebook

To convert a number into a string, use the function str. Remember to convert numbers into strings before concatenating them with other strings.

number = 1
string = str(number)
print (type (string))
print (type (number))

<class 'str'>
<class 'int'>

Test program to explore formatting strings

movie1 = "Clear and present danger"
movie2 = "tom dick and harry"
print ("My favorite movies \n \t", movie1, "\n \t", movie2)

My favorite movies
       Clear and present danger
       tom dick and harry

Notes on String manipulation

Strings are sequences of characters and can be indexed like a list or an array (although, unlike lists, they are immutable). Therefore, string[0] gives me the first character.

string1= 'Ragavan'
print(string1[0])
print(string1[4])
print(string1[3:]) #this is to print a range of the characters

R
v
avan

len() - for the length of the string. A space is included in the count. This can be used to figure out the middle of the string.

string1= 'Shreyas Ragavan'
print (type(len(string1)))
#note that the space is included as a count

<class 'int'>

String slice formula: variable[start:stop], where stop is exclusive. To include the character at index end, write variable[start:end+1].

string1= 'Shreyas Ragavan'
s2= string1[2:]
s3=string1[4:6]
s4=string1[2:5]
print (s2)
print (s3)
print (s4)

reyas Ragavan
ya
rey

Floor division (//) rounds the result down to an integer, which is handy for finding the middle index of a string. In Python 2, / between two integers also performed integer division.

s1="Ragavan"
s2="Jayanthi"
s1_5= len(s1)//2
s2_5= len(s2)//2
print (s1[s1_5:], s2[s2_5:])

avan nthi

Test program

word = 'Python'
first=word[0]
rest=word[1:]
result=rest + "-" + first +"y"
print (result)

ython-Py

Find your python version

Note taken on [2018-07-16 Mon 11:17] The following is already added to .bash_profile when Anaconda is installed (Mac OS)
This is added by Anaconda 2.2.0 installer

    export PATH="/Users/shreyas/Applications/anaconda/bin:$PATH"

This is added by Anaconda3 5.1.0 installer

    export PATH="/Users/shreyas/anaconda_install/anaconda3/bin:$PATH"

To reload the profile:

source ~/.bash_profile

It is important to verify that the intended python interpreter version is being used.

Using the command line (shell):

python --version

Using the sys module in a python program.

import sys

print(sys.version_info)
print(sys.version)

sys.version_info(major=3, minor=6, micro=5, releaselevel='final', serial=0)
3.6.5 |Anaconda, Inc.| (default, Apr 26 2018, 08:42:37)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]

Finding the versions of scipy, numpy, pandas, scikit-learn, matplotlib

[ ] Write exception handling if the libraries are not installed.
[ ] include the python version, as well as a list of the virtual environments in specified location.

import scipy
import numpy
import matplotlib
import pandas
import sklearn

print('scipy: %s' %scipy.__version__)
print('Numpy: %s' %numpy.__version__)
print('Matplotlib: %s' %matplotlib.__version__)
print('Pandas: %s' %pandas.__version__)
print('scikit-learn: %s' %sklearn.__version__)

scipy: 1.2.0
Numpy: 1.15.4
Matplotlib: 3.0.2
Pandas: 0.24.1
scikit-learn: 0.20.2
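
A possible approach to the first task above (exception handling when a library is not installed), sketched with importlib from the standard library. The helper name get_version is my own.

```python
import importlib


def get_version(module_name):
    """Return the module's __version__, or a note if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError.
        return "not installed"
    # Not every module defines __version__, so fall back to a default.
    return getattr(module, "__version__", "unknown")


for name in ["scipy", "numpy", "matplotlib", "pandas", "sklearn"]:
    print(f"{name}: {get_version(name)}")
```

With this, a missing library is reported in the listing instead of stopping the script at the first failed import.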

Commenting in Python

Multi-line comments can be written by commenting out individual lines with #, or by using triple quotes. A triple-quoted block is really a string literal rather than a comment, and depending on its location (the first statement in a module, class or function) it becomes a docstring. When in doubt, add a # on each subsequent line.

# single line. This is called a block comment
# another single line

"""
This is a multiple line comment
and is enclosed within triple quotes.
I am using org mode in Emacs
"""

print("hello world")  # this is an inline comment

python debugger

Once you have an idea of where things might be breaking down, insert the following line of code into your script: import pdb; pdb.set_trace() and run it. This is the Python debugger, and it will drop you into interactive mode at that line. The debugger can also be run from the command line with python -m pdb <myfile.py>.

- <https://realpython.com/python-beginner-tips/>

The following resources were utilised to develop the snippets and notes below. Other links are also available inline with the text, and I am working on using org-ref for a proper bibliography system (where possible).

The Mouse v/s The Python - Mike Driscoll's website
Real Python email newsletters, books, courses.
Howard Abram's video on literate dev-ops using Emacs, as well as his blog posts in general
Python cookbook : recipes for mastering Python 3.
Ted Petrou's courses on data science
pybites
Business Science Slack channel
Data36 blog posts and Slack channel

Platform independent shell commands like open

Note here the use of platform.system() to identify the platform. The subprocess.call command is used to launch a subprocess as a separate thread.

# Once the folder is created, I want to download pictures of cats
# for this I need a folder location that is available and then need a URL to download from
def display_cats(folder):
    # open folder
    print("Displaying cats in OS window.")
    if platform.system() == "Darwin":
        subprocess.call(["open", folder])
    elif platform.system() == "Windows":
        subprocess.call(["explorer", folder])
    elif platform.system() == "Linux":
        subprocess.call(["xdg-open", folder])
    else:
        print("We don't support your os: " + platform.system())

Python example functions to download a file via URL

'stream' argument with requests.get
shutil library to copy a file object into a binary file.

import os
import shutil

import requests


def get_cat(
    folder,
    name,
    url="http://consuming-python-services-api.azurewebsites.net/cats/random",
):
    """Download data from a url and save it to a file. In this case the
    data is a cat picture.

    """
    data = get_data_from_url(url)
    save_image(folder, name, data)


def get_data_from_url(url):
    # Note the stream argument. This is required so that response.raw
    # exposes the actual file contents as a file-like object.
    response = requests.get(url, stream=True)
    print("Status code is {}".format(response.status_code))
    return response.raw


def save_image(folder, name, data):
    # Build <folder>/<name>.jpg; data is the file-like object to copy from
    file_name = os.path.join(folder, name + ".jpg")
    with open(file_name, "wb") as f:
        shutil.copyfileobj(data, f)

Python truthiness

Empty lists, tuples, dictionaries, sets and strings, zero ints, zero floats, None and False are all deemed to be false in python. Python has no null or nil; None plays that role.
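The rule above can be checked directly with bool(). This is a minimal sketch; the describe helper is a hypothetical name used only for illustration.

```python
# Values that evaluate to False in a boolean context
falsy_values = [[], (), {}, set(), "", 0, 0.0, None, False]
print(any(bool(value) for value in falsy_values))   # False

# Non-empty containers and non-zero numbers are truthy,
# even when their contents are themselves falsy
truthy_values = [[0], (None,), {"a": 1}, "0", -1, 0.1]
print(all(bool(value) for value in truthy_values))  # True


# Idiomatic emptiness check: rely on truthiness instead of len()
def describe(items):
    return "has items" if items else "empty"


print(describe([]))      # empty
print(describe([1, 2]))  # has items
```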

Conditional expression has lower precedence than other operators

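Example: here each plus operator binds more tightly than the conditional, so the expression is grouped as ('a' + 'x') if '123'.isdigit() else ('y' + 'b'). Only the branch that is selected is actually evaluated.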

'a' + 'x' if '123'.isdigit() else 'y' + 'b'
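A short sketch of how the grouping affects the result. The variable names are only for illustration.

```python
# Without parentheses, each + belongs to one branch of the conditional
result = 'a' + 'x' if '123'.isdigit() else 'y' + 'b'
print(result)   # ax

# The same expression with the implicit grouping written out
grouped = ('a' + 'x') if '123'.isdigit() else ('y' + 'b')
print(grouped)  # ax

# Parentheses around the conditional change the meaning
inner = 'a' + ('x' if '123'.isdigit() else 'y') + 'b'
print(inner)    # axb
```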

Running modules with python `-m`

How to Run Your Python Scripts – Real Python

The -m option searches sys.path for the module name and runs its content as __main__. Note: the argument needs to be a module name as it would appear in an import statement (for example package.module), not a file path, so the .py extension is left out.
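As an illustration, the standard library module json.tool can be run with -m. The sketch below launches it from Python through subprocess so that it stays self-contained; it assumes Python 3.7 or later for the capture_output and text arguments.

```python
import json
import subprocess
import sys

# Equivalent to the shell command:
#   echo '{"b": 2, "a": 1}' | python -m json.tool
completed = subprocess.run(
    [sys.executable, "-m", "json.tool"],
    input='{"b": 2, "a": 1}',
    capture_output=True,
    text=True,
    check=True,
)
print(completed.stdout)  # the same JSON, pretty-printed

# json.tool ran as __main__, read stdin and wrote valid JSON back
parsed = json.loads(completed.stdout)
```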

Function can be defined and then assigned later

def func(x):
    return x + 1

f = func
print(f(2) + func(2))

String join is different from append

Note here how the result is papaya!. The a is inserted between each pair of adjacent elements of the list, i.e. join is different from append.

The iterable to join is always the sole argument to join(), which is called on the string you want to join with.

string = 'ppy!'
fruit = 'a'.join(list(string))
print(fruit)

papaya!

String to list

Note then that the whitespaces between words are also captured and this is why the sentence can be reconstructed meaningfully.

text = "The Zen of Python, by Tim Peters"
print(list(text))

# Using a split command on a string will result in a list with the items
# decided based on the split character used.
print(text.split())

['T', 'h', 'e', ' ', 'Z', 'e', 'n', ' ', 'o', 'f', ' ', 'P', 'y', 't', 'h', 'o', 'n', ',', ' ', 'b', 'y', ' ', 'T', 'i', 'm', ' ', 'P', 'e', 't', 'e', 'r', 's']
['The', 'Zen', 'of', 'Python,', 'by', 'Tim', 'Peters']

Objects of type float should not be compared with the equality operator

Instead specify a tolerance and check the difference is within the tolerance

tolerance = 0.001
print(1.1 + 2.2 == 3.3)
comparison = abs((1.1 + 2.2) - 3.3) < tolerance
print(comparison)

False
True
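The standard library offers math.isclose for the same purpose. A small sketch, noting that its default tolerance is relative, so comparisons against zero need an explicit abs_tol:

```python
import math

total = 1.1 + 2.2
print(total == 3.3)              # False
print(math.isclose(total, 3.3))  # True, default rel_tol is 1e-09

# Near zero a relative tolerance is not enough, so pass abs_tol
difference = 0.1 + 0.2 - 0.3
print(math.isclose(difference, 0.0))                # False
print(math.isclose(difference, 0.0, abs_tol=1e-9))  # True
```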

Daily coding problem book

This section contains the solutions outlined in the daily coding problem book with my comments and notes and tasks for further exploration.

The idea is to practice the elements of writing good code, starting with docstrings and comments, and to assess the time and space complexity of each solution.

Smallest window

Find the smallest window (left and right indices) within an array that must be sorted in order for the whole array to be sorted.

Note: Smart loop initialization and choosing the right kind of variable are the key.

# method 1: use the sorted() function available in python. This method is O(n log n)
def smallest_window(input_array):
    left, right = None, None
    s = sorted(input_array)
    # Now we know what the sorted array looks like. However this means
    # O(n) extra space.

    for i in range(len(input_array)):
        # left stays None until the first index where the array differs
        # from its sorted copy, so that index becomes left. Every later
        # mismatch overwrites right, which therefore ends up as the last
        # mismatching index. Here left and right are simply the bounds of
        # the window and are not related to ascending or descending
        # order.
        if input_array[i] != s[i] and left is None:
            left = i
        elif input_array[i] != s[i]:
            right = i
    return left, right

# method 2: traverse the array once from each end. This method is O(n)
def smallest_window_traverse(input_array):
    left, right = None, None
    n = len(input_array)
    max_seen, min_seen = -float("inf"), float("inf")

    for i in range(n):
        max_seen = max(max_seen, input_array[i])
        # here max_seen is the running maximum of the array. It starts
        # at -inf, so the first element always replaces it. An element
        # smaller than the running maximum is out of place, so the
        # window has to extend at least up to it.
        if input_array[i] < max_seen:
            right = i

    for i in range(n-1, -1 , -1):
        min_seen = min(min_seen, input_array[i])
        # min_seen has just been updated, so the element has to be
        # compared as being larger than the running minimum.
        if input_array[i] > min_seen:
            left = i
            # How is this the smallest window? The same array is
            # traversed from both directions, and a bound is only moved
            # when an element is out of place with respect to the
            # running max or min. Elements outside the bounds are
            # already in their sorted position.
    return left, right

a = [ 33,1, 444, 555,12,11, 666]
sorted_a = sorted(a)
print(sorted_a)
smallest_window(a)
b = -float("inf")
print(b)

[1, 11, 12, 33, 444, 555, 666]
-inf

Maximum Subarray sum

Given an array calculate the maximum sum of any contiguous subarray

[X] Fix the indices for the brute force algorithm
- The inner loop has to go one step past the last index of the array, because the end index of a slice is exclusive.
[ ] Why O(n^3) for brute force?
- n: the first loop traverses the entire length
- n: the second loop traverses the remaining length again
- n: sum() traverses each slice again

# brute force of considering every subarray combination
def brute_force_subarray_sum(array):
    current_max = 0
    # The outer loop picks the start index i and the inner loop picks
    # the exclusive end index j of the slice array[i:j]. j has to run up
    # to len(array) inclusive so that slices ending at the last element
    # are considered. When j == i the slice is empty and its sum is 0.
    # As such this makes no difference to results.
    for i in range(len(array)):
        for j in range(i, len(array)+1):
            current_max = max(current_max, sum(array[i:j]))
            print(sum(array[i:j]))
    return current_max


# Implement kadane's algorithm
def kadane_subarray_sum(array):
    maxSoFar, maxHere = 0, 0
    for item in array:
        maxHere = max(item, maxHere + item)
        print(f"Max at point is {maxHere}")
        maxSoFar = max(maxSoFar, maxHere)
        print(f"Max so far is {maxSoFar}")
    return maxSoFar

a = [1, 4 ,5, 6, 7]
print(f"Kadane max subarray is {kadane_subarray_sum(a)}")
print(f"Brute force subarray is {brute_force_subarray_sum(a)}")
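The array above contains only positive numbers, so the answer is simply the sum of the whole array. The sketch below repeats both approaches without the debugging prints (the function names are illustrative, not from the book) and checks them on arrays with negative numbers, where the empty subarray with sum 0 is allowed.

```python
def max_subarray_brute_force(array):
    """O(n^3): try every slice array[i:j] and sum it."""
    best = 0  # the empty subarray has sum 0
    for i in range(len(array)):
        for j in range(i + 1, len(array) + 1):
            best = max(best, sum(array[i:j]))
    return best


def max_subarray_kadane(array):
    """O(n): track the best sum of a subarray ending at each element."""
    best, best_ending_here = 0, 0
    for item in array:
        best_ending_here = max(item, best_ending_here + item)
        best = max(best, best_ending_here)
    return best


mixed = [34, -50, 42, 14, -5, 86]
print(max_subarray_kadane(mixed))         # 137, from [42, 14, -5, 86]
print(max_subarray_brute_force(mixed))    # 137
print(max_subarray_kadane([-5, -1, -8]))  # 0, the empty subarray
```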

Array product manipulation

[X] devise a solution with division
[ ] Improve the solution made using division using list comprehension

def products(nums):
    """Given an input of numbers in an array - generate a new array with the
    products of all the items in the input, excluding the item at the
    corresponding index. Division is not to be used.

    The methodology : for each item in the given array, another 2 arrays
    of corresponding prefix and suffix products are created. Then all
    that has to be done is multiply corresponding elements of the prefix
    and suffix for each element of the input provided.

    """
    # Generate prefix products.  Here prefix_products[-1] * num refers
    # to the previous entry in the list, which is already a product of
    # the /corresponding/ prefixes.  Also note here the way in which the
    # empty list is created and the conditional takes advantage of this
    # fact. This allows the very first element to stay in the loop.
    prefix_products = []
    for num in nums:
        if prefix_products:
            prefix_products.append(prefix_products[-1] * num)
        else:
            prefix_products.append(num)

    # Generate suffix products. One key here is in using the
    # reversed(). This enables the function between prefix and suffix to
    # stay largely the same. 
    suffix_products = []
    for num in reversed(nums):
        if suffix_products:
            suffix_products.append(suffix_products[-1] * num)
        else:
            suffix_products.append(num)
    # Note how the list is being reversed again to bring it in order
    suffix_products = list(reversed(suffix_products))

    # Generate the results from product of prefix and suffix
    results = []
    for i in range(len(nums)):
        # at the starting of the list, there is no prefix. Therefore
        # only a product of the suffix is required, and from the next
        # index.
        if i == 0:
            results.append(suffix_products[i + 1])
        # at the end of the list, there is no suffix, and hence only the
        # prefix products are required.
        elif i == len(nums) - 1:
            results.append(prefix_products[i - 1])
        else:
            results.append(prefix_products[i - 1] * suffix_products[i + 1])
    return results

Using Division

def products_elements_division(array):
    product = 1
    for i in range(len(array)):
        product = product * array[i]

    productOtherElements = []
    for i in range(len(array)):
        productOtherElements.append(product/array[i])
    return productOtherElements

a = [1,2,3,4]
products_elements_division(a)
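One possible take on the open task of using a list comprehension is sketched below. It assumes integer input with no zeros (a zero would raise ZeroDivisionError) and Python 3.8 or later for math.prod; the function name is hypothetical.

```python
import math


def products_by_division(nums):
    """Product of all other elements, assuming no element is zero."""
    total = math.prod(nums)
    # Floor division keeps the results as ints for integer input; it is
    # exact here because every element divides the total product.
    return [total // num for num in nums]


print(products_by_division([1, 2, 3, 4]))  # [24, 12, 8, 6]
```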

Palindrome function

def is_palindrome(word):
    """Function to take in a word, convert it to lower case and check whether the word is a palindrome."""
    word = word.lower()
    # A palindrome reads the same when reversed. The comparison already
    # yields a boolean, so it can be returned directly.
    return word[::-1] == word

word = "Deleveled"
is_palindrome(word)

While, break and continue

This example is from a free pybites exercise.

VALID_COLORS = ['blue', 'yellow', 'red']


def print_colors():
    """In the while loop ask the user to enter a color,
       lowercase it and store it in a variable. Next check:
       - if 'quit' was entered for color, print 'bye' and break.
       - if the color is not in VALID_COLORS, print 'Not a valid color' and continue.
       - otherwise print the color in lower case."""

    # The following loop will continue as long as the expression is true.
    while True:
        color = input("Enter color\n").lower()
        if color == "quit":
            print("bye")
            break
        if color not in VALID_COLORS:
            print("Not a valid color")
            continue

        print(color)

print_colors()

`f-strings` and `str.format()`

f-strings were introduced in python 3.6 as a new way to format strings. They build on str.format() and simplify the syntax.

name = "Fred"
print(f"He said his name is {name}.")
# the print() is not necessary on the interpreter.
# The variable can be directly defined within the curly brackets without the need to call the format method

# The earlier str.format() option looked like this
name2 = "Astaire"
print("His name is {}, and his friend's name is {}".format(name, name2))

He said his name is Fred.
His name is Fred, and his friend's name is Astaire

General python overview

high level language (i.e. compact syntax, and easy to learn)
Technically - python code is compiled to bytecode, but not to machine code.
memory management and other aspects are handled internally. Low level languages like C leave these to the programmer.
The original (reference) implementation of python is CPython (written in C). Python was first released in 1991.
Dynamically typed language (a variable can be rebound to a value of a different type after it is first assigned)
considered an interpreted language, as CPython compiles the code to bytecode at runtime and then interprets it.

Syntax, `type` and operator notes

// : floor division
** : exponentiation
% : modulus operator
unary and binary operators
- Unary : -5 or +10
- binary : 5 -10
- the exponential operator has higher precedence than the unary operator.

        print(-5 ** 2)
        print((-5)**2)
        import sys
        print(sys.executable)
    
        -25
        25
        /home/shrysr/miniconda3/envs/min_ds/bin/python

The `not` operator takes precedence over `and` which takes precedence over `or`.
Augmented assignment
- `+=`, `-=`, `*=`, `/=`, `//=`, `%=`, `**=`
- the variable to which this is being assigned must already exist.
id : every object created is stored in a specific location in memory. In CPython, id returns an integer derived from this location. However, it is important to note that 'a' in the example below is simply a name that refers to the actual object. 'a' by itself is not an object.

a = 6
print(id(a))

94222151525568
94222151525664
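The distinction between a name and the object it refers to can be sketched with a mutable object. The variable names are only for illustration.

```python
a = [1, 2, 3]
b = a          # b is a second name for the same list object
c = list(a)    # c refers to a new list with equal contents

print(id(a) == id(b))  # True: both names refer to one object
print(a is b)          # True: 'is' compares identity
print(a == c, a is c)  # True False: equal values, different objects

b.append(4)    # mutating through one name is visible through the other
print(a)       # [1, 2, 3, 4]
print(c)       # [1, 2, 3]
```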

Common Built-in object types are:
- int,
- bool,
- float,
- complex (by appending 'j').
- list
- dict
- tuple
- set
strings:
- sequence of characters
- Character: smallest possible component of text that can be printed with single keyboard press.
- a single character is a string of length 1
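Encoding - UTF8/ASCII
- ASCII: represents 128 unique characters using 7 bits. 7 bits can encode 2^7 = 128 characters.
- bit : smallest unit of information for a computer.
- unicode: assigns each character a number called a code point. A fixed-width encoding such as UTF-32 stores each code point in 4 bytes. There are 8 bits per byte, so 4 bytes could hold 2^32 (about 4 billion) distinct values, although Unicode only defines code points up to 0x10FFFF. UTF-8 is a variable-width encoding that uses 1 to 4 bytes per character.
- internally, each character is represented as an integer (its code point) in python.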
- More details can be found at link.

> … a Unicode string is a sequence of code points, which are numbers from 0 through 0x10FFFF (1,114,111 decimal). This sequence of code points needs to be represented in memory as a set of code units, and code units are then mapped to 8-bit bytes. The rules for translating a Unicode string into a sequence of bytes are called a character encoding, or just an encoding.
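A small sketch of code points and encodings using built-ins only. The sample characters are written as escapes so the snippet does not depend on the encoding of the file it is pasted into.

```python
# ord() gives the code point of a character, chr() is the inverse
print(ord("a"))  # 97
print(chr(97))   # a

# UTF-8 is variable width: 1 to 4 bytes per code point
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]
for ch in samples:
    print(hex(ord(ch)), len(ch.encode("utf-8")))
# 0x61 1
# 0xe9 2
# 0x20ac 3
# 0x1f600 4

# Encoding and then decoding returns the original text
text = "na\u00efve caf\u00e9"
encoded = text.encode("utf-8")
print(len(text), len(encoded))  # 10 12
print(encoded.decode("utf-8") == text)  # True
```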

Reasons for not using Python

No single programming language is 'always' the right choice.
Example: "..unlikely you're going to write a real-time operating system kernel in Python."
Example: unlikely that Python will be used to implement the next generation rendering engine.

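Parentheses to chain methods

Enclosing the entire expression within parentheses is necessary when a chain of methods is split across lines, because line breaks are significant in python. Inside parentheses they are ignored.

# using parentheses to put methods on different lines
s = "Is this the test sentence?!"  # hypothetical example string
(s.rstrip('?!')
  .lower()
  .replace('t', 'a')
  .count('e'))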

Pandas

Frequency of occurrence `.value_counts()`

`groupby` : same as `group_by` in R

`agg()` : similar to `summarize` in R

ins.groupby('sex').agg({'charges': ['mean', 'max', 'count']}).round(0)

`pivot_table()` : comparison across groups made easier

pt = ins.pivot_table(index='sex', columns='region',
                     values='charges', aggfunc='mean').round(0)
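The ins DataFrame is not defined in these notes. The sketch below builds a tiny hypothetical stand-in with the same column names (sex, region, charges) so that value_counts, groupby with agg, and pivot_table can be tried out. The values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the insurance data used above
ins = pd.DataFrame({
    "sex": ["female", "male", "female", "male"],
    "region": ["north", "north", "south", "south"],
    "charges": [100.0, 200.0, 300.0, 400.0],
})

# Frequency of occurrence of each value in a column
counts = ins["sex"].value_counts()
print(counts)

# Several aggregations of one column per group
summary = ins.groupby("sex").agg({"charges": ["mean", "max", "count"]}).round(0)
print(summary)

# One row per sex, one column per region, mean charges in each cell
pt = ins.pivot_table(index="sex", columns="region",
                     values="charges", aggfunc="mean").round(0)
print(pt)
```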

Test example

ins = read_csv("~/my_projects/

DataFrame and Series

DataFrame: 2D, rectangular data structure
Series: A single dimension of data. Similar to a single column of data, or a 1D array.

Concise notes of the pep 8 style guide

pep 8 :- python enhancement proposal 8.
the guide is not gospel. There may be situations where not following the style is more important. Therefore one should know when to be inconsistent.
Consistency within a module or function is most important. This is followed by consistency within the project and then consistency with the style guide.
The above is especially applicable when dealing with older code.
4 spaces per indentation level
[ ] Find out what are hanging indents
[ ] Note the methods of closing brackets for multi-line components.
Limit all lines to a max of 79 characters
Code in the core Python distribution should always use UTF-8 (or ASCII in Python 2).
Files using ASCII (in Python 2) or UTF-8 (in Python 3) should not have an encoding declaration.
imports should be on separate lines.
However this is okay : from subprocess import Popen, PIPE
Imports are always put on top of a file
Imports should be grouped in the following order:
- Standard library imports.
- Related third party imports.
- Local application/library specific imports.
- You should put a blank line between each group of imports.
Module level "dunders" (i.e. names with two leading and two trailing underscores) such as __all__, __author__, __version__, etc. should be placed after the module docstring but before any import statements except from __future__ imports. Python mandates that future-imports must appear in the module before any other code except docstrings.
use inline comments sparingly. Inline comments are comments that are on the same line as the statement
Functions and classes should be separated by 2 blank lines
continuations of long expressions onto additional lines should be indented by 4 extra spaces from their normal indentation level.

Naming conventions

Class Names: Class names should normally use the CapWords convention.

The naming convention for functions may be used instead in cases where the interface is documented and used primarily as a callable.

Function and Variable Names: Function names should be lowercase, with words separated by underscores as necessary to improve readability.

Variable names follow the same convention as function names.

Method Names and Instance Variables: Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.

Use one leading underscore only for non-public methods and instance variables.

Constants: Constants are usually defined on a module level and written in all capital letters with underscores separating words. Examples include MAX_OVERFLOW and TOTAL.
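The conventions above can be seen together in one small module. This is only an illustrative sketch: every name in it (CatDownloader, build_path, count_overflow) is hypothetical.

```python
"""Hypothetical module used only to illustrate PEP 8 naming and layout."""

# Module-level dunder: after the docstring, before the imports
__version__ = "0.1.0"

# Standard library imports come first, one import per line
import os

# Constant: capital letters with underscores
MAX_OVERFLOW = 100


class CatDownloader:
    """Class name in CapWords."""

    def __init__(self, folder):
        self.folder = folder  # instance variable: lowercase
        self._attempts = 0    # non-public: one leading underscore

    def build_path(self, file_name):
        """Method name in lowercase with underscores."""
        return os.path.join(self.folder, file_name)


def count_overflow(values, limit=MAX_OVERFLOW):
    """Function name in lowercase with underscores."""
    return sum(1 for value in values if value > limit)


downloader = CatDownloader("cats")
print(downloader.build_path("tom.jpg"))
print(count_overflow([50, 150, 250]))  # 2
```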

Using try and except

try:
    schedule_file = open('schedule.txt', 'r')

except FileNotFoundError as err:
    print (err)

[Errno 2] No such file or directory: 'schedule.txt'
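A slightly fuller sketch of the same idea, adding the else and finally clauses and a with block so the file is closed automatically. The function name and the file name are hypothetical.

```python
def read_schedule(path):
    """Return the lines of the file at path, or [] if it is missing."""
    try:
        # The with block closes the file even if reading fails
        with open(path, "r") as schedule_file:
            lines = schedule_file.readlines()
    except FileNotFoundError as err:
        print(err)
        return []
    else:
        # Runs only when the try block raised no exception
        return lines
    finally:
        # Runs in every case, before the function returns
        print("Finished trying to read", path)


print(read_schedule("schedule_that_does_not_exist.txt"))
```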

Notes on Functions

A list of functions / dictionaries :dictionary:

Source: Dan Bader @ Real Python

Functions are objects, so they can be stored in a list (or a dictionary) like any other value and called later through their index (or key). This is essentially creating a list of functions that can be easily called or updated as required.

def addition(a,b):
    return a+b

def subtraction(a,b):
    return a-b

def multiplication(a,b):
    return a*b

func_list = [addition, subtraction, multiplication]
print(func_list[0](3,2))
print(func_list[1](4,5))
print(func_list[2](6,4))

5
-1
24
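The heading also mentions dictionaries. The same functions can be stored as dictionary values and looked up by a readable key. A minimal sketch; the operations dictionary and the calculate helper are hypothetical names.

```python
def addition(a, b):
    return a + b


def subtraction(a, b):
    return a - b


def multiplication(a, b):
    return a * b


# Map a key to each function object (no parentheses: not called yet)
operations = {
    "add": addition,
    "subtract": subtraction,
    "multiply": multiplication,
}

print(operations["add"](3, 2))       # 5
print(operations["subtract"](4, 5))  # -1


def calculate(name, a, b):
    """Look up the operation by name and apply it to a and b."""
    try:
        return operations[name](a, b)
    except KeyError:
        raise ValueError(f"Unknown operation: {name}") from None


print(calculate("multiply", 6, 4))   # 24
```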

Expand shortened pathnames `expanduser`

from os.path import expanduser
print(expanduser('~/my_org/journal'))

/Users/shreyas/my_org/journal

Lists

List comprehension

Source: bonus content videos of Real Python

defining a list directly using a for loop.
shorthand for a regular for loop
shorthand for a regular loop and adding filtering

# Example of directly defining a list using a shorthand for loop
print([x * x for x in range(10)])

# Basically the above is the same as:

squares = []
for x in range(10):
    squares.append(x * x)

print(squares)


# modifying the original code

squares2 = [x * x for x in range(10)]
print(squares2)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

<https://www.programiz.com/python-programming/list-comprehension>

Format: new_list = [expression for member in iterable if condition]

squares = [i * i for i in range(10)]
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Format with a conditional and a loop. Start with the for loop and then add the conditional. The leading expression is the item that is returned.

def filter_positive_even_numbers(numbers):
    """Receives a list of numbers, and returns a filtered list of only the
       numbers that are both positive and even (divisible by 2), try to use a
       list comprehension."""
       
    return [number for number in numbers if (number >0) and (number %2 == 0)]

A list comprehension produces a list, whereas map() produces a map object, which is a lazy iterator.
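The difference can be sketched as follows. The sample numbers are arbitrary.

```python
numbers = [-2, -1, 0, 1, 2, 3, 4]

# List comprehension: builds the whole list immediately
doubled = [n * 2 for n in numbers]
print(type(doubled).__name__)   # list

# map(): returns a lazy iterator, items are computed on demand
mapped = map(lambda n: n * 2, numbers)
print(type(mapped).__name__)    # map
first_pass = list(mapped)
print(first_pass == doubled)    # True
second_pass = list(mapped)
print(second_pass)              # [] because the iterator is exhausted

# Filtering: a comprehension condition versus filter()
evens = [n for n in numbers if n > 0 and n % 2 == 0]
print(evens)                    # [2, 4]
print(list(filter(lambda n: n > 0 and n % 2 == 0, numbers)))  # [2, 4]
```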

Using the assert method :assert:

Source: python tricks book
assert is used to check that a condition holds. The program will proceed if the condition is true. If not, an AssertionError will be raised. The error message can be customised.
assertions are internal self-checks and not meant as a communication to the user.
meant to inform the developer about unrecoverable errors in the program. This differentiates it from usual if-else conditionals
Aids in debugging.
Using an additional argument it is possible to provide a custom message.
Don't use asserts for data validation
- assertions can be disabled globally using the -O and -OO switches (and other techniques)
- When disabled - none of the conditional expressions will be evaluated.
- Therefore, never use an assert to check for admin privileges or such conditions. Remember the example of deleting a product from a product catalog.
Asserts that never fail
- passing a tuple as the first argument, e.g. assert (condition, "message"), never fails, because a non-empty tuple is always true.

Simple example

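# Perform calculation only if a >= 100 and b <= 200

a = 20
b = 200

def test_func(a,b):
    # use the boolean operator "and" here; & is the bitwise operator
    assert (a >= 100) and (b <= 200)
    return (a + b)

# a = 20 does not satisfy the condition, so this call raises an AssertionError
print(test_func(a,b))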

# Building on the same example as in the book

def apply_discount(product, discount, threshold):
    price = int(product['price'] * (1.0 * discount))
    assert threshold <= price <= product['price']
    return price

# Defining the product dictionary. One of the keys in the dictionary have to be price

shoes = {'name': 'adidas', 'price': 14900}

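# Applying a discount of 10/100 on shoes, and defining the threshold to be 1500.
# Note that the function multiplies the price by the discount itself, so the
# price becomes int(14900 * 0.1) = 1490. This is below the threshold and
# trips the assertion.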
apply_discount(shoes, 10/100, threshold = 1500)

AssertionError Traceback (most recent call last)
<ipython-input-17-a8990c61cc1b> in <module>()
     11 shoes = {'name': 'adidas', 'price': 14900}
     12
---> 13 apply_discount(shoes, 10/100, threshold = 1500)

<ipython-input-17-a8990c61cc1b> in apply_discount(product, discount, threshold)
      4 def apply_discount(product, discount, threshold):
      5     price = int(product['price'] * (1.0 * discount))
----> 6     assert threshold <= price <= product['price']
      7     return price
      8

AssertionError:
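The traceback above ends with an empty message. A custom message can be supplied as the second operand of assert. The sketch below is a variant written for illustration, not the book's exact code: it assumes the discount is a fraction between 0 and 1 and applies it as 1.0 - discount.

```python
def apply_discount(product, discount):
    """Return the discounted price; discount is a fraction such as 0.25."""
    price = int(product["price"] * (1.0 - discount))
    # The second operand of assert is the message shown on failure
    assert 0 <= price <= product["price"], f"invalid price: {price}"
    return price


shoes = {"name": "adidas", "price": 14900}
print(apply_discount(shoes, 0.25))  # 11175

try:
    apply_discount(shoes, 2.0)  # a 200% discount gives a negative price
except AssertionError as err:
    print(f"AssertionError: {err}")
```

Because assertions are skipped when python runs with -O, a check like this is a safety net for the developer and not a replacement for validating user input.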

Strings :strings:

General notes

Some methods associated with manipulating strings:

upper()
count()

cap_me = "hello this is a normal string"
print(cap_me.upper())
print(cap_me.count('i'))
# 3_d = "hello"  # SyntaxError: a variable name cannot start with a digit

the quotes surrounding a string are called delimiters, which tell python where the string starts and ends.

Notes on String formatting

Strings can be enclosed within a single or a double quote. Better to be consistent through the program.
Strings can be concatenated with +, and spaces can be included. Hash is used to include comments. The print command, when given several arguments, joins them with a single space.

#This is a comment that is not interpreted.
first= 'monty'
second='python'
total= first + " " + second
print (total)
print (first,"", second)
#notice how an extra space is added in the 2nd line. This basically means that the space is added automatically.
print (first,second,'. This is corrected.')

monty python
monty  python
monty python . This is corrected.

Double quote and single quote combination can be used for words that have an apostrophe

print("test")

test

print("This is Mac's notebook")
#if i had used a single quote to enclose the string, then the string would have terminated at Mac's. Knowing this is useful with respect to string manipulation.

This is Mac's notebook

To convert a number into a string, use the function str. Remember to convert numbers into strings before concatenating them with other strings.

number = 1
string = str(number)
print (type (string))
print (type (number))

<class 'str'>
<class 'int'>

Test program to explore formatting strings

movie1 = "Clear and present danger"
movie2 = "tom dick and harry"
print ("My favorite movies \n \t", movie1, "\n \t", movie2)

My favorite movies
       Clear and present danger
       tom dick and harry

Notes on String manipulation

Strings are sequences of characters and can be indexed like a list or an array (unlike lists, they cannot be modified in place). Therefore, string[0] gives me the first character

string1= 'Ragavan'
print(string1[0])
print(string1[4])
print(string1[3:]) #this is to print a range of the characters

R
v
avan

len() - for the length of the string. A space is included in the count. This can be used to figure out the middle of the string.

string1= 'Shreyas Ragavan'
print (type(len(string1)))
#note that the space is included as a count

<class 'int'>

String slice formula: variable[start:stop]. The stop index is exclusive, so to include the character at index end use variable[start:end+1]

string1= 'Shreyas Ragavan'
s2= string1[2:]
s3=string1[4:6]
s4=string1[2:5]
print (s2)
print (s3)
print (s4)

reyas Ragavan
ya
rey

Integer (floor) division with // rounds the result down to a whole number. In Python 2, / applied to two integers is also integer division.

s1="Ragavan"
s2="Jayanthi"
s1_5= len(s1)//2
s2_5= len(s2)//2
print (s1[s1_5:], s2[s2_5:])

avan nthi

Test program

word = 'Python'
first=word[0]
rest=word[1:]
result=rest + "-" + first +"y"
print (result)

ython-Py

Find your python version

Note taken on [2018-07-16 Mon 11:17] The following is already added to .bash_profile when Anaconda is installed (Mac OS)
This is added by Anaconda 2.2.0 installer

    export PATH="/Users/shreyas/Applications/anaconda/bin:$PATH"

This is added by Anaconda3 5.1.0 installer

    export PATH="/Users/shreyas/anaconda_install/anaconda3/bin:$PATH"

To reload the profile:

source ~/.bash_profile

It is important to verify that the intended python interpreter version is being used.

Using the command line (shell):

python --version

Using the sys module in a python program.

import sys

print(sys.version_info)
print(sys.version)

sys.version_info(major=3, minor=6, micro=5, releaselevel='final', serial=0)
3.6.5 |Anaconda, Inc.| (default, Apr 26 2018, 08:42:37)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]

Finding the versions of scipy, numpy, pandas, scikit-learn, matplotlib

[ ] Write exception handling if the libraries are not installed.
[ ] include the python version, as well as a list of the virtual environments in specified location.

import scipy
import numpy
import matplotlib
import pandas
import sklearn

print('scipy: %s' %scipy.__version__)
print('Numpy: %s' %numpy.__version__)
print('Matplotlib: %s' %matplotlib.__version__)
print('Pandas: %s' %pandas.__version__)
print('scikit-learn: %s' %sklearn.__version__)

scipy: 1.2.0
Numpy: 1.15.4
Matplotlib: 3.0.2
Pandas: 0.24.1
scikit-learn: 0.20.2
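One way to approach the open tasks above (handling libraries that are not installed, and including the python version) is sketched below. The report_versions helper is a hypothetical name, and the sketch assumes each library exposes __version__, falling back to "unknown" otherwise.

```python
import importlib
import sys


def report_versions(module_names):
    """Return a dict mapping each module name to its version string."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            # Not every module defines __version__, hence the default
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


print("python:", sys.version.split()[0])
libraries = ["scipy", "numpy", "matplotlib", "pandas", "sklearn"]
for name, version in report_versions(libraries).items():
    print(f"{name}: {version}")
```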

Commenting in Python

# single line. This is called a block comment
# another single line

"""
This is a multiple line comment
and is enclosed within triple quotes.
I am using org mode in Emacs
"""

print("hello world")  # this is an inline comment

python debugger

Once you have an idea of where things might be breaking down, insert the line import pdb; pdb.set_trace() into your script and run it. This starts the Python debugger and drops you into interactive mode at that line. The debugger can also be run from the command line with python -m pdb <myfile.py>.

- <https://realpython.com/python-beginner-tips/>

python notes

Platform independent shell commands like open

Python example functions to download a file via URL

Python truthiness

Conditional expression has lower precedence than other operators

Running modules with python -m

Function can be defined and then assigned later

String join is different from append

String to list

Objects of type float should not be compared with the equality operator

Daily coding problem book

Smallest window

Maximum Subarray sum

Array product manipulation

Palindrome function

While, break and continue

f-strings and str.format()

General python overview

Syntax, type and operator notes

Reasons for not using Python

Parentheses to chain methods

Pandas

Frequency of occurrence .value_counts()

groupby : same as group_by in R

agg() : similar to summarize in R

pivot_table() : comparison across groups made easier

Test example

DataFrame and Series

Concise notes of the pep 8 style guide

Naming conventions

Using try and except

Notes on Functions

A list of functions / dictionaries :dictionary:

Expand shortened pathnames expanduser

Lists

List comprehension

Using the assert method :assert:

Strings :strings:

General notes

Notes on String formatting

Notes on String manipulation

Test program

Find your python version

Finding the versions of scipy, numpy, pandas, scikit-learn, matplotlib

Commenting in Python

python debugger
